MIXING FOR PROGRESSIONS IN NON-ABELIAN GROUPS
Abstract. We study the mixing properties of progressions $(x, xg, xg^2)$, $(x, xg, xg^2, xg^3)$
of length three and four in a model class of finite non-abelian groups, namely
the special linear groups $SL_d(F)$ over a finite field $F$, with $d$ bounded. For
length three progressions $(x, xg, xg^2)$, we establish a strong mixing property
(with error term that decays polynomially in the order $|F|$ of $F$), which among
other things counts the number of such progressions in any given dense subset
$A$ of $SL_d(F)$, answering a question of Gowers for this class of groups. For
length four progressions $(x, xg, xg^2, xg^3)$, we establish a partial result in the
$d = 2$ case if the shift $g$ is restricted to be hyperbolic, although in this case
we do not recover polynomial bounds in the error term. Our methods include
the use of the Cauchy-Schwarz inequality, the abelian Fourier transform, the
Lang-Weil bound for the number of points in an algebraic variety over a finite
field, some algebraic geometry, and (in the case of length four progressions)
the multidimensional Szemerédi theorem.



1. Introduction 

Let $G = (G, \cdot)$ be a finite group, not necessarily abelian. Given a natural number
$k \geq 1$ and $k$ functions $f_0, \dots, f_{k-1} : G \to \mathbb{C}$, we define the $k$-linear form

    $\Lambda_{k,G}(f_0, \dots, f_{k-1}) := \mathbb{E}_{x,g \in G} \prod_{i=0}^{k-1} f_i(x g^i)$

where $\mathbb{E}$ denotes the averaging notation

    $\mathbb{E}_E f := \mathbb{E}_{x \in E} f(x) := \frac{1}{|E|} \sum_{x \in E} f(x)$

for non-empty finite sets $E$ and complex-valued functions $f$ on $E$, with $|E|$ denoting
the cardinality of the set $E$. Thus, for instance, if $A$ is a subset of $G$, with the
associated indicator function $1_A : G \to \{0, 1\}$, $\Lambda_{k,G}(1_A, \dots, 1_A)$ denotes the number
of (possibly degenerate) length $k$ geometric progressions $(x, xg, \dots, xg^{k-1})$ in $A$,
divided by $|G|^2$.

The form $\Lambda_{k,G}$ is easily computed for $k = 1, 2$:

    $\Lambda_{1,G}(f_0) = \mathbb{E}_G f_0$
    $\Lambda_{2,G}(f_0, f_1) = (\mathbb{E}_G f_0)(\mathbb{E}_G f_1).$
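The two identities above can be checked by brute force in a small non-abelian group. The following Python sketch is only an illustration: the choice $G = S_3$, the test functions, and the helper names `mul`, `power`, `mean` and `Lambda` are assumptions of the sketch and do not appear in the paper.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical toy model: G = S_3, elements stored as permutation tuples.
G = list(permutations(range(3)))

def mul(p, q):
    """Group law on permutation tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def power(g, n):
    """n-th power of g for n >= 0."""
    out = tuple(range(len(g)))
    for _ in range(n):
        out = mul(out, g)
    return out

def mean(f):
    """E_G f, the average of f over G."""
    return sum(f[x] for x in G) / len(G)

def Lambda(fs):
    """Lambda_{k,G}(f_0,...,f_{k-1}) = E_{x,g} prod_i f_i(x g^i)."""
    total = Fraction(0)
    for x, g in product(G, G):
        term = Fraction(1)
        for i, f in enumerate(fs):
            term *= f[mul(x, power(g, i))]
        total += term
    return total / len(G) ** 2

# Arbitrary rational-valued test functions on G (illustrative only).
f0 = {x: Fraction(i + 1) for i, x in enumerate(G)}
f1 = {x: Fraction(i * i, 7) for i, x in enumerate(G)}
f2 = {x: Fraction((-1) ** i * (i + 2)) for i, x in enumerate(G)}

lam1 = Lambda([f0])
lam2 = Lambda([f0, f1])
lam3 = Lambda([f0, f1, f2])
print(lam1 == mean(f0), lam2 == mean(f0) * mean(f1))
# For k = 3 there is no exact identity; compare the two numbers by eye.
print(lam3, mean(f0) * mean(f1) * mean(f2))
```

Exact rational arithmetic is used so that the $k = 1, 2$ identities hold with equality rather than up to rounding; for $k = 3$ the two printed quantities are in general different.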

Now we turn to the $k = 3$ case. If $f_0, f_1, f_2$ are selected in a sufficiently "random"
fashion, then probabilistic heuristics suggest that one has

(1.1)    $\Lambda_{3,G}(f_0, f_1, f_2) \approx (\mathbb{E}_G f_0)(\mathbb{E}_G f_1)(\mathbb{E}_G f_2)$



1991 Mathematics Subject Classification. 11B30, 20D60. 




TERENCE TAO 



and more generally

(1.2)    $\Lambda_{k,G}(f_0, \dots, f_{k-1}) \approx \prod_{i=0}^{k-1} \mathbb{E}_G f_i.$

However, if $G$ has a non-trivial low-dimensional unitary representation $\rho : G \to
U_d(\mathbb{C})$ for some small $d$, then it becomes possible to violate the heuristic (1.1).
Indeed, if one lets $B$ be a small neighbourhood of the identity in $U_d(\mathbb{C})$, and sets
$B'$ to be the slightly larger neighbourhood

    $B' := B \cdot B^{-1} \cdot B := \{ b_1 b_2^{-1} b_3 : b_1, b_2, b_3 \in B \},$

with the associated preimages $A := \rho^{-1}(B)$, $A' := \rho^{-1}(B')$, then from the identity

    $\rho(xg^2) = \rho(xg) \rho(x)^{-1} \rho(xg)$

we see that $xg^2 \in A'$ whenever $x, xg \in A$. In particular, we have

    $\Lambda_{3,G}(1_A, 1_A, 1_{A'}) = \Lambda_{2,G}(1_A, 1_A) = (\mathbb{E}_G 1_A)(\mathbb{E}_G 1_A),$

which violates (1.1) if $B$ (and hence $B'$) is small enough; if the dimension $d$ is
small, this can be done with a relatively large value for the density $\mathbb{E}_G 1_A$. A
similar argument applies to exhibit a deviation from (1.2) for any $k \geq 3$.
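The simplest instance of this construction is one-dimensional. The following Python sketch takes the cyclic group $\mathbb{Z}/N\mathbb{Z}$ (written additively) with the character $x \mapsto e(x/N)$, so that $A$ is an interval around $0$ and $A'$ is $A - A + A$. The parameters `N` and `r` and all variable names are assumptions made for illustration only.

```python
from fractions import Fraction

N, r = 101, 5      # hypothetical parameters, chosen only for illustration
G = range(N)

# A is the preimage of a small arc B around the identity of the unit circle.
A = {x % N for x in range(-r, r + 1)}
# A' is the preimage of B * B^{-1} * B, which in additive notation is A - A + A.
A_prime = {(a1 - a2 + a3) % N for a1 in A for a2 in A for a3 in A}

def density(S):
    return Fraction(len(S), N)

# Lambda_3(1_A, 1_A, 1_A') = E_{x,g} 1_A(x) 1_A(x+g) 1_A'(x+2g)
count = sum(1 for x in G for g in G
            if x in A and (x + g) % N in A and (x + 2 * g) % N in A_prime)
lambda3 = Fraction(count, N * N)

predicted = density(A) ** 2 * density(A_prime)   # right-hand side of heuristic (1.1)
print(lambda3, density(A) ** 2, predicted)
```

In this toy case the third constraint is automatic, so the count equals the square of the density of $A$, which exceeds the heuristic prediction by the factor $1/\mathbb{E}_G 1_{A'}$.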

The deviation from (1.1) is most pronounced in the case when $G$ is abelian (so
that all irreducible unitary representations of $G$ are in fact one-dimensional). In
this case we will switch to additive notation and write the group operation of $G$ as
$+$, so that

(1.3)    $\Lambda_{3,G}(f_0, f_1, f_2) := \mathbb{E}_{x,g \in G} f_0(x) f_1(x + g) f_2(x + 2g).$

The analysis of this form usually begins by introducing the Fourier transform

    $\hat f(\xi) := \mathbb{E}_{x \in G} f(x) e(-\xi \cdot x)$

for all $\xi$ in the Pontryagin dual $\hat G$ of $G$, defined as the space of all homomorphisms
$\xi : x \mapsto \xi \cdot x$ from $G$ to the (additive) unit circle $\mathbb{R}/\mathbb{Z}$, where $e(x) := e^{2\pi i x}$; of
course, $\hat G$ is encoding the irreducible one-dimensional unitary representations of $G$
mentioned previously. Using the Fourier inversion formula

    $f(x) = \sum_{\xi \in \hat G} \hat f(\xi) e(\xi \cdot x)$

one soon arrives at the useful identity

    $\Lambda_{3,G}(f_0, f_1, f_2) = \sum_{\xi \in \hat G} \hat f_0(\xi) \hat f_1(-2\xi) \hat f_2(\xi)$

relating the magnitude of $\Lambda_{3,G}(f_0, f_1, f_2)$ with the size of the Fourier coefficients
of $f_0, f_1, f_2$. Note that the heuristic (1.1) corresponds to the $\xi = 0$ term in this
sum; the point is that the non-zero frequencies $\xi \neq 0$ can also give a significant
contribution.
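The Fourier identity above is easy to test numerically. The sketch below does so for $G = \mathbb{Z}/N\mathbb{Z}$, identifying the dual group with $\mathbb{Z}/N\mathbb{Z}$ via $\xi \cdot x = \xi x / N$; the modulus, the random test functions and the helper names are assumptions of the sketch.

```python
import cmath
import random

N = 7                                  # hypothetical modulus; G = Z/NZ
random.seed(0)
f = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]
     for _ in range(3)]                # f[0], f[1], f[2] : G -> C

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def fourier(fn, xi):
    """hat f(xi) = E_x f(x) e(-xi x / N)."""
    return sum(fn[x] * e(-xi * x / N) for x in range(N)) / N

# Physical side: E_{x,g} f0(x) f1(x+g) f2(x+2g), as in (1.3).
lhs = sum(f[0][x] * f[1][(x + g) % N] * f[2][(x + 2 * g) % N]
          for x in range(N) for g in range(N)) / N ** 2

# Frequency side: sum over xi of hat f0(xi) hat f1(-2 xi) hat f2(xi).
rhs = sum(fourier(f[0], xi) * fourier(f[1], (-2 * xi) % N) * fourier(f[2], xi)
          for xi in range(N))

# The xi = 0 term is the product of the means, i.e. the heuristic (1.1).
main_term = fourier(f[0], 0) * fourier(f[1], 0) * fourier(f[2], 0)
print(abs(lhs - rhs), abs(lhs - main_term))
```

The first printed number should vanish up to rounding error, while the second measures the contribution of the non-zero frequencies.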

Using the above identity, one can eventually establish the Roth-Varnavides-Meshulam
theorem

(1.4)    $\Lambda_{3,G}(1_A, 1_A, 1_A) \geq c_3(\delta)$

for any $0 < \delta \leq 1$, any finite abelian group $G$, and any subset $A \subset G$ with
$|A| \geq \delta |G|$, where $c_3(\delta) > 0$ depends only on $\delta$; see e.g. [28, Theorem 10.9]. In a
similar vein, we have the deep theorem of Szemerédi [25], which implied^1 the more
general lower bound,

(1.5)    $\Lambda_{k,G}(1_A, \dots, 1_A) \geq c_k(\delta)$

for all $k \geq 1$ and $0 < \delta \leq 1$, any finite abelian groups $G$, and any $A \subset G$ with
$|A| \geq \delta |G|$, where $c_k(\delta) > 0$ depends only on $k$ and $\delta$.

Remark 1.1. More explicit bounds for €3(6) are known. For general abelian groups 
G, an argument of Bourgain ^ gives C3 {5) > cS^'/^' for some absolute constants 
c,C > 0; see e.g. [281 Theorem 10.30]. In the case when G is a cyclic group, the 
strongest bound to date is due to Sanders '2'6\ , who (in our notation) established 
that C3((5) > 0(5*^'°^ iiiQ other hand, in this case one also has the upper 

bound cz{5) < C'(5'=i°g(i/'5) due to Behrend [3J. When G is a vector space over a 
fixed finite field F of odd order (such as F^), the best bound is due to Bateman 
and Katz [2], who established c^{5) > exp(— C(5'^~^) for some constants C, c > 
depending only on F . For k > 3 and for cyclic groups, the explicit bounds known 
are weaker: for k = 4, the results in [9J give C4^{S) > cexp(— C(5~'^'°s(i/'5)^^ while for 
higher k, the results in [llj give Ck{S) > exp(exp(— Cfci5~'^'=)) for some constants 
Ck,Gk > depending on k; in the other direction, a modification of the Behrend 
construction J20j gives Ck (6) < CkS''^^°s''Hi/s)_ general groups, explicit lower 
bounds on $c_k(\delta)$ are known thanks to the recent quantitative work on the density
Hales-Jewett theorem [18] or the hypergraph removal lemma [12], [21], [22], [26],
but the bounds are rather poor.

Now we turn to the case when $G$ is not necessarily abelian, and in particular
to the quasirandom case in which $G$ has no low-dimensional representations. More
precisely, following Gowers [12], call a finite group $G$ $D$-quasirandom if the only
non-trivial irreducible unitary representations $\rho : G \to U_d(\mathbb{C})$ have dimension $d$
greater than or equal to $D$. A model example of quasirandom groups is provided by the
special linear groups over a finite field:

Proposition 1.2 (Quasirandomness of special linear group). Let $d \geq 2$ be an
integer, and let $F$ be a finite field. Then the group $SL_d(F)$ of $d \times d$ matrices with
coefficients in $F$ of determinant one is $c_d |F|^{d-1}$-quasirandom, for some $c_d > 0$
depending only on $d$.

Proof. This follows from the results in [15]. The case when $d = 2$ and $|F|$ has prime
order is classical, dating back to the work of Frobenius. Similar results hold for
other finite (almost) simple groups of Lie type and bounded rank; see [15]. □

When $D$ is large, one expects better mixing properties in the forms $\Lambda_{k,G}$. To
illustrate this, we introduce the variant expressions

    $\Lambda^*_{k,G}(f_0, \dots, f_{k-1}) := \mathbb{E}_{g \in G} \left| \mathbb{E}_{x \in G} \prod_{i=0}^{k-1} f_i(x g^i) - \prod_{i=0}^{k-1} \mathbb{E}_G f_i \right|,$

which controls the number of length $k$ progressions for a single (generic) shift $g$, as
opposed to the average number over all such $g$. This expression vanishes for $k = 1$,
but can be non-trivial for $k > 1$. From the triangle inequality we have

(1.6)    $\left| \Lambda_{k,G}(f_0, \dots, f_{k-1}) - \prod_{i=0}^{k-1} \mathbb{E}_G f_i \right| \leq \Lambda^*_{k,G}(f_0, \dots, f_{k-1})$

and so the heuristic (1.2) holds whenever $\Lambda^*_{k,G}(f_0, \dots, f_{k-1})$ is small. However,
when one has a low-dimensional representation $\rho : G \to U_d(\mathbb{C})$, it is possible
for $\Lambda^*_{k,G}(f_0, \dots, f_{k-1})$ to be large even when (1.2) holds. Consider for instance the
$k = 2$ case, in which (1.2) holds exactly. If we let $B$ be a small neighbourhood of the
identity in $U_d(\mathbb{C})$ with preimage $A := \rho^{-1}(B)$ as before, and set $A' := \rho^{-1}(B^{-1} \cdot B)$,
we see that $1_A(x) 1_A(xg)$ vanishes whenever $g \notin A'$, and thus

    $\Lambda^*_{2,G}(1_A, 1_A) = \mathbb{E}_{g \in G} \left| \mathbb{E}_{x \in G} 1_A(x) 1_A(xg) - (\mathbb{E}_G 1_A)^2 \right|$

can be lower bounded by $(\mathbb{E}_G 1_A)^2 (1 - \mathbb{E}_G 1_{A'})$, which can be somewhat large if $B$
is chosen small enough, and $d$ is small.

^1 Strictly speaking, the original theorem of Szemerédi only treats the case when $G$ is a cyclic
group, but subsequent proofs of Szemerédi's theorem (such as the hypergraph-based proofs in [13],
[21], [22], [26]) allow for one to handle arbitrary abelian groups $G$.

As observed first by Gowers [12], though, $\Lambda^*_{2,G}$ becomes much smaller in the
quasirandom case. Indeed, we have the inequality

(1.7)    $\| f_1 * f_2 \|_{L^2(G)} \leq D^{-1/2} |G| \, \| f_1 \|_{L^2(G)} \| f_2 \|_{L^2(G)}$

for any $D$-quasirandom group $G$ and any functions $f_1, f_2 : G \to \mathbb{C}$ with at least
one of $f_1, f_2$ having mean zero, where

    $\| f \|_{L^2(G)} := (\mathbb{E}_{x \in G} |f(x)|^2)^{1/2}$

and $*$ denotes the discrete^2 convolution

    $f_1 * f_2(x) := \sum_{y \in G} f_1(y) f_2(y^{-1} x) = \sum_{y \in G} f_1(x y^{-1}) f_2(y);$

see [1] or [?, Proposition 3]. Note that (1.7) improves by a factor of $D^{-1/2}$ over the
trivial bound of $|G| \, \| f_1 \|_{L^2(G)} \| f_2 \|_{L^2(G)}$ arising from the Young and Cauchy-Schwarz
inequalities.

The estimate (1.7) has the following useful corollary:

Lemma 1.3 ($k = 2$ mixing for quasirandom groups). If $G$ is a $D$-quasirandom
group, then

    $\Lambda^*_{2,G}(f_1, f_2) \leq D^{-1/2} \| f_1 \|_{L^2(G)} \| f_2 \|_{L^2(G)}.$

Proof. Observe that the expression $\Lambda^*_{2,G}(f_1, f_2)$ does not change if $f_1$ or $f_2$ is
modified by an additive constant. Thus we may normalise $f_1$ and $f_2$ to both have mean
zero. We can then write

    $\Lambda^*_{2,G}(f_1, f_2) = \mathbb{E}_{g \in G} f_0(g) \mathbb{E}_{x \in G} f_1(x) f_2(xg)$

for some function $f_0 : G \to \mathbb{C}$ of magnitude 1. The right-hand side can be rewritten,
after the change of variables $y = xg$, as

    $\frac{1}{|G|} \mathbb{E}_{y \in G} (f_1 * f_0)(y) f_2(y).$

The claim then follows from (1.7) and the Cauchy-Schwarz inequality. □
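Lemma 1.3 can be sanity-checked numerically in a small quasirandom group. The Python sketch below uses $G = SL_2(F_5)$, which has order 120 and whose non-trivial irreducible representations have dimension at least 2, so that one may take $D = 2$. The choice of group, the random test functions and the helper names are assumptions of the sketch, not part of the paper.

```python
import math
import random
from itertools import product

p = 5   # hypothetical choice; SL_2(F_5) has order 120 and is 2-quasirandom
G = [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
     if (a * d - b * c) % p == 1]

def mul(m, n):
    """Product of 2x2 matrices (a, b, c, d) = [[a, b], [c, d]] over F_p."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

random.seed(1)
f1 = {x: random.uniform(-1, 1) for x in G}
f2 = {x: random.uniform(-1, 1) for x in G}

def mean(f):
    return sum(f.values()) / len(G)

def l2_norm(f):
    """Normalised L^2(G) norm (E_x |f(x)|^2)^{1/2}."""
    return math.sqrt(sum(v * v for v in f.values()) / len(G))

# Lambda*_{2,G}(f1, f2) = E_g | E_x f1(x) f2(xg) - (E f1)(E f2) |
main = mean(f1) * mean(f2)
lam_star = sum(abs(sum(f1[x] * f2[mul(x, g)] for x in G) / len(G) - main)
               for g in G) / len(G)

D = 2   # smallest dimension of a non-trivial irreducible representation of SL_2(F_5)
bound = D ** -0.5 * l2_norm(f1) * l2_norm(f2)
print(lam_star, bound)
```

For random inputs the left-hand side is typically far below the bound, since the lemma is a worst-case statement.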

^2 Ordinarily, one would normalise this convolution by $1/|G|$ for compatibility with the averaging
in the $L^2(G)$ norm, but it will be convenient to use the discrete normalisation because we will be
passing from a group $G$ to various subgroups of $G$ in subsequent arguments.

In [12], Gowers posed the question of whether results such as Lemma 1.3 could
be extended to higher values of $k$, so that the heuristic (1.1) or (1.2) could hold
for sufficiently quasirandom groups. We were not able to settle this question in
general, but in the $k = 3$ case we can affirmatively answer the question for a model
class of quasirandom groups, namely the special linear groups $SL_d(F)$ over a finite
field $F$:

Theorem 1.4. Let $F$ be a finite field, and set $G := SL_d(F)$ for some $d \geq 2$. Then
we have

    $\Lambda^*_{3,G}(f_0, f_1, f_2) \ll_d |F|^{-\min(d-1,2)/8} \prod_{i=0}^{2} \| f_i \|_{L^\infty(G)}$

for all functions $f_0, f_1, f_2 : G \to \mathbb{C}$, where $\| f \|_{L^\infty(G)} := \sup_{x \in G} |f(x)|$. Here and
in the sequel we use $Y \ll_d X$, $X \gg_d Y$, or $Y = O_d(X)$ to denote the estimate
$|Y| \leq C_d X$ for some $C_d$ depending only on $d$, and similarly with $d$ replaced by
other sets of parameters. In particular, from (1.6) one has

    $\Lambda_{3,G}(f_0, f_1, f_2) = (\mathbb{E}_G f_0)(\mathbb{E}_G f_1)(\mathbb{E}_G f_2) + O_d\left( |F|^{-\min(d-1,2)/8} \prod_{i=0}^{2} \| f_i \|_{L^\infty(G)} \right).$

Theorem 1.4 is proven primarily through application of the Cauchy-Schwarz
inequality and Lemma 1.3; we give this proof in Sections 2-4. The key point is that
the non-abelian nature of $G$ means that the application of Cauchy-Schwarz creates
more averaging than is seen in the abelian case. The exponent $\min(d-1, 2)/8$ is
unlikely to be optimal. By taking $f_0, f_1, f_2$ to be constant on left cosets $gH$ of a
proper subgroup $H$ of $G$ and of mean zero, we see that one cannot replace the quantity
$|F|^{-\min(d-1,2)/8}$ by anything much smaller than $|H|/|G|$; in particular, if we take $H$
to be the Borel subgroup of upper-triangular matrices in $G$, for which $|H|/|G|$ is
comparable to $|F|^{-d(d-1)/2}$, we see that one cannot replace $\min(d-1, 2)/8$ by any
exponent greater than $d(d-1)/2$. It is likely that one
can extend Theorem 1.4 to other finite simple groups^3 of Lie type with bounded
rank, but we will not do so here.

Applying Theorem 1.4 to indicator functions $f_0 = 1_A$, $f_1 = 1_B$, $f_2 = 1_C$ and
using Markov's inequality, we conclude in particular the "weak mixing" bound

    $\mu(A \cap Bg \cap Cg^2) = \mu(A) \mu(B) \mu(C) + O_d(|F|^{-\min(d-1,2)/16})$

for a proportion $1 - O_d(|F|^{-\min(d-1,2)/16})$ of $g \in G$, where $\mu(A) := \mathbb{E}_G 1_A = |A|/|G|$
denotes the density of $A$ in $G$.

We conjecture that Theorem 1.4 can be extended to higher values of $k$ than
$k = 3$ (possibly with a smaller exponent than $\min(d-1, 2)/8$). Unfortunately, the
Cauchy-Schwarz argument does not seem to extend beyond $k = 3$; in contrast to the
abelian case, in the non-abelian setting it appears that when $k > 3$, each application
of Cauchy-Schwarz increases the complexity of the resulting form, rather than
decreasing it as in the abelian case. However, we are able to establish the following
weak partial result in the $k = 4$, $d = 2$ case, in which the shift $g$ is restricted to be
hyperbolic:



^3 To be pedantic, $SL_d(F)$ is usually not a simple group, due to its nontrivial centre; but it is
a bounded cover of a finite simple group, namely $PSL_d(F)$. Note that the results for $SL_d(F)$ in
this paper automatically descend to the quotient group $PSL_d(F)$ without difficulty.

Theorem 1.5. Let $F$ be a finite field, and set $G := SL_2(F)$. Let $S$ denote all the
elements of $G$ which are hyperbolic in the sense that they are diagonalisable over
$F$. Then for all functions $f_0, f_1, f_2, f_3 : G \to \mathbb{C}$, one has



where $o_{|F| \to \infty}(X)$ denotes a quantity bounded by $c(|F|) X$ for some quantity $c(|F|)$
that goes to zero as $|F|$ goes to infinity.

It is easy to show that for large S has density about 1/2 in G; see SectionlH 
The main reason why the shift $g$ is restricted to $S$ in our arguments is in order to
ensure that $g$ is contained in a non-trivial metabelian subgroup of $G$; for instance, if
$g$ is a diagonal matrix with entries in $F$, then it is contained in the Borel subgroup
$B$ of upper triangular matrices in $G$. The argument is rather ad hoc in nature,
combining Cauchy-Schwarz and the abelian Fourier transform with some explicit
nonabelian effects coming from the algebraic structure of progressions in the Borel
group. It also relies on (a quantitative version of) the multidimensional Szemerédi
theorem of Furstenberg and Katznelson '5^ , which is the reason for the poor decay 
in $|F|$. Finally, to pass from the Borel subgroup back to the full group, an expansion
result in $SL_2(F)$, related to the Bourgain-Gamburd expansion theory in this group,
is also required.

Remark 1.6. The results in this paper concern the mixing properties of the patterns
$(x, xg, xg^2)$ and $(x, xg, xg^2, xg^3)$ for an explicit class of quasirandom groups,
namely the special linear groups. In a recent paper with Vitaly Bergelson [4], we
also establish some mixing properties for the patterns $(x, xg, gx)$ and $(g, x, xg, gx)$
in arbitrary quasirandom groups. While the end results of both papers are superficially
similar in nature, the proof techniques turn out to be completely different,
with the results in [4] relying on nonstandard analysis, the triangle removal lemma
from graph theory, and ergodic theorems involving idempotent ultrafilters. In both 
cases, the methods are tailored to the specific patterns being counted, and it appears 
we are still quite far from a general theory that can cover all nonabelian patterns 
involving two or more variables such as x,g. 

We also remark that in [27], some mixing properties of patterns of the form
$(x, y, P(x,y))$ were established when $P : G \times G \to G$ was a definable function over
a finite field of large characteristic. However, the arguments in that paper (which
also involve Cauchy-Schwarz, but applied in a slightly different fashion) required
$\{ (P(x,y), P(x,y'), P(x',y), P(x',y')) : x, y, x', y' \in G \}$ to be sufficiently Zariski dense in
$G^4$. This is not the case for the pattern $(x, xg, xg^2)$ (in which $P(x,y) := y x^{-1} y$),
since P{x, y) and P{x, y') are necessarily conjugate to each other. 

1.7. Acknowledgments. The author was partially supported by a Simons Investigator
award from the Simons Foundation and by NSF grant DMS-0649473. He also
thanks Vitaly Bergelson for many stimulating discussions regarding these topics.

2. A general bound for $\Lambda^*_{3,G}$



Let us define the reduced spectral norm $\| \mu \|_{S(G)}$ of a function $\mu : G \to \mathbb{C}$ to be
the best constant such that

    $\| f * \mu \|_{L^2(G)} \leq \| \mu \|_{S(G)} \| f \|_{L^2(G)}$

whenever $f : G \to \mathbb{C}$ has mean zero, thus

(2.1)    $| \mathbb{E}_{z \in G} f_1(z) (f_2 * \mu)(z) | \leq \| \mu \|_{S(G)} \| f_1 \|_{L^2(G)} \| f_2 \|_{L^2(G)}$

for all $f_1, f_2 : G \to \mathbb{C}$, as can be seen by splitting $f_1, f_2$ into constant and mean
zero components, and noting that all cross terms vanish.

Remark 2.1. From the Peter-Weyl theorem, one can also write $\| \mu \|_{S(G)}$ as

    $\| \mu \|_{S(G)} = \sup_\rho \left\| \sum_{g \in G} \mu(g) \rho(g) \right\|_{op}$

where $\rho : G \to U(V)$ ranges over all non-trivial irreducible finite-dimensional unitary
representations of $G$. We will not make much use of this representation-theoretic
interpretation of the reduced spectral norm here, although we remark that
this interpretation can be used to derive the basic quasirandomness inequality (1.7)
(or (2.3) below).

The reduced spectral norm $\| \mu \|_{S(G)}$ is clearly a seminorm, and in particular obeys
the triangle inequality. From Minkowski's inequality, we have the crude bound

(2.2)    $\| \mu \|_{S(G)} \leq \| \mu \|_{\ell^1(G)}.$

From (1.7) we also have the more refined estimate

(2.3)    $\| \mu \|_{S(G)} \leq D^{-1/2} |G|^{1/2} \| \mu \|_{\ell^2(G)}$

when $G$ is $D$-quasirandom. If we split $\mu$ into the region where $\mu(x) > C_0/|G|$, and
the region where $\mu(x) \leq C_0/|G|$, for some threshold $C_0 > 0$, and apply (2.2) to the
former and (2.3) to the latter (on which $\| \mu \|_{\ell^2(G)} \leq C_0 |G|^{-1/2}$), we conclude that

(2.4)    $\| \mu \|_{S(G)} \leq C_0 D^{-1/2} + \sum_{x \in G : \mu(x) > C_0/|G|} \mu(x).$

By combining these estimates with the Cauchy-Schwarz inequality, we can obtain
the following general bound on the quantity $\Lambda^*_{3,G}(f_0, f_1, f_2)$.

Proposition 2.2. Let $G = (G, \cdot)$ be a $D$-quasirandom group for some $D \geq 1$. Let
$C_0 \geq 1$ be a parameter. Then we have

(2.5)    $\Lambda^*_{3,G}(f_0, f_1, f_2) \ll \left( C_0 D^{-1/2} + \mathbb{E}_{b,h \in G} \sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \right)^{1/4} \prod_{i=0}^{2} \| f_i \|_{L^\infty(G)}$

for all functions $f_0, f_1, f_2 : G \to \mathbb{C}$, where for each $b, h \in G$, $\mu_{b,h} : G \to \mathbb{C}$ is the
function

(2.6)    $\mu_{b,h} := \mathbb{E}_{g \in G} \mathbb{E}_{c \in Z(b)} \delta_{g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1}}$

where $Z(b) := \{ c \in G : cb = bc \}$ is the centraliser of $b$.

One can view $\mu_{b,h}$ as a probability measure on $G$, describing the distribution
of the random variable $g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1}$ when $g$ is a randomly chosen element
of $G$, and $c$ is a random element commuting with $b$. The estimate (2.5) becomes
useful when $\mu_{b,h}$ is approximately uniformly distributed over $G$ for typical $b, h$, so
that $\sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y)$ is small.

Proof. When $f_0$ is equal to a constant $c$, we have

    $\Lambda^*_{3,G}(f_0, f_1, f_2) = |c| \, \Lambda^*_{2,G}(f_1, f_2)$

and the claim then follows from Lemma 1.3. As $\Lambda^*_{3,G}$ is sublinear in each of the
three arguments, we may thus assume that $f_0$ has mean zero. We then also assume
that $f_0, f_1, f_2$ are real-valued, and normalise so that

    $\| f_0 \|_{L^\infty(G)} = \| f_1 \|_{L^\infty(G)} = \| f_2 \|_{L^\infty(G)} = 1.$

Our task is now to show that

    $\Lambda^*_{3,G}(f_0, f_1, f_2) \ll \left( C_0 D^{-1/2} + \mathbb{E}_{b,h \in G} \sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \right)^{1/4}.$

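Ever since the work of Gowers [10], it has been common to control expressions
such as $\Lambda^*_{3,G}(f_0, f_1, f_2)$ via the Cauchy-Schwarz inequality. In the literature, this
was mostly performed in the abelian case, but one can obtain a useful estimate via
Cauchy-Schwarz in the non-abelian case too. First, we shift $x$ by $g^{-1}$ to obtain

    $\Lambda^*_{3,G}(f_0, f_1, f_2) = \mathbb{E}_{g \in G} | \mathbb{E}_{x \in G} f_0(x g^{-1}) f_1(x) f_2(xg) |$

which we expand as

    $\Lambda^*_{3,G}(f_0, f_1, f_2) = \mathbb{E}_{x \in G} f_1(x) \left( \mathbb{E}_{g \in G} f_0(x g^{-1}) f_2(xg) f_3(g) \right)$

for some^4 function $f_3 : G \to \mathbb{C}$ bounded in magnitude by 1. Applying Cauchy-Schwarz
in $x$ to eliminate $f_1$, we obtain

    $\Lambda^*_{3,G}(f_0, f_1, f_2) \leq \left( \mathbb{E}_{x \in G} | \mathbb{E}_{g \in G} f_0(x g^{-1}) f_2(xg) f_3(g) |^2 \right)^{1/2}.$

We can expand the right-hand side as

    $\left( \mathbb{E}_{x,g,g' \in G} f_0(x g^{-1}) f_0(x (g')^{-1}) f_2(xg) f_2(xg') f_3(g) f_3(g') \right)^{1/2}.$

Making the change of variables $(y, g, a) := (xg, g, g^{-1} g')$, this becomes

    $\left( \mathbb{E}_{y,g,a \in G} f_0(y g^{-2}) f_0(y g^{-1} a^{-1} g^{-1}) f_2(y) f_2(ya) f_3(g) f_3(ga) \right)^{1/2}.$

If we define $\Delta_a f(y) := f(y) f(ya)$, this becomes

    $\left( \mathbb{E}_{y,a \in G} \Delta_a f_2(y) \left( \mathbb{E}_{g \in G} \Delta_{g a^{-1} g^{-1}} f_0(y g^{-2}) \Delta_a f_3(g) \right) \right)^{1/2}.$

Applying Cauchy-Schwarz in $y, a$ to eliminate $\Delta_a f_2$, we thus have

    $\Lambda^*_{3,G}(f_0, f_1, f_2) \leq \left( \mathbb{E}_{y,a \in G} | \mathbb{E}_{g \in G} \Delta_{g a^{-1} g^{-1}} f_0(y g^{-2}) \Delta_a f_3(g) |^2 \right)^{1/4}.$

The right-hand side can be expanded as

    $\left( \mathbb{E}_{y,a,g,g' \in G} \Delta_{g a^{-1} g^{-1}} f_0(y g^{-2}) \Delta_{g' a^{-1} (g')^{-1}} f_0(y (g')^{-2}) \Delta_a f_3(g) \Delta_a f_3(g') \right)^{1/4}.$

Making the change of variables $(z, b, g, h) := (y g^{-2}, g a^{-1} g^{-1}, g, g' g^{-1})$, we conclude
the inequality

(2.7)    $| \Lambda^*_{3,G}(f_0, f_1, f_2) | \leq \left( \mathbb{E}_{z,b,g,h \in G} \Delta_b f_0(z) \Delta_{h b h^{-1}} f_0(z g h^{-1} g^{-1} h^{-1}) \Delta_{g^{-1} b^{-1} g} f_3(g) \Delta_{g^{-1} b^{-1} g} f_3(hg) \right)^{1/4}.$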



^4 If one is only interested in bounding $\Lambda_{3,G}(f_0, f_1, f_2)$ rather than $\Lambda^*_{3,G}(f_0, f_1, f_2)$, one can
take $f_3 = 1$, and the reader may wish to do so initially in the argument that follows in order to
simplify the exposition.

The right-hand side of (2.7) can be viewed as a twisted, weighted variant^5 of the
Gowers norm [10]. To control it, we observe the self-averaging identity

    $\mathbb{E}_{h \in G} F(h) = \mathbb{E}_{h \in G} \mathbb{E}_{c \in C} F(hc)$

for any non-empty set $C \subset G$ and any function $F : G \to \mathbb{C}$. We apply this identity
with $C$ equal to the centraliser $Z(b) := \{ c \in G : cb = bc \}$ of $b$ and $F$ equal to
the expression being averaged on the right-hand side of (2.7); the point of this
averaging is to exploit the trivial observation that the function $\Delta_{h b h^{-1}} f_0$ does not
change if one replaces $h$ by $hc$ for an arbitrary $c \in Z(b)$. We conclude that

    $| \Lambda^*_{3,G}(f_0, f_1, f_2) | \leq \left( \mathbb{E}_{z,b,g,h \in G} \mathbb{E}_{c \in Z(b)} \Delta_b f_0(z) \Delta_{h b h^{-1}} f_0(z g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1}) \Delta_{g^{-1} b^{-1} g} f_3(g) \Delta_{g^{-1} b^{-1} g} f_3(hcg) \right)^{1/4}.$

We can rewrite the right-hand side as

(2.8)    $\left| \mathbb{E}_{b,h \in G} \mathbb{E}_{z \in G} \Delta_b f_0(z) (\Delta_{h b h^{-1}} f_0 * \mu'_{b,h})(z) \right|^{1/4}$

where $\mu'_{b,h}$ is a weighted version^6 of $\mu_{b,h}$:

    $\mu'_{b,h} := \mathbb{E}_{g \in G} \mathbb{E}_{c \in Z(b)} \delta_{g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1}} \, \Delta_{g^{-1} b^{-1} g} f_3(g) \, \Delta_{g^{-1} b^{-1} g} f_3(hcg).$

Our task is now to show that

(2.9)    $\left| \mathbb{E}_{b,h \in G} \mathbb{E}_{z \in G} \Delta_b f_0(z) (\Delta_{h b h^{-1}} f_0 * \mu'_{b,h})(z) \right| \ll C_0 D^{-1/2} + \mathbb{E}_{b,h \in G} \sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y).$

From (2.1) we see that

    $\left| \mathbb{E}_{z \in G} \Delta_b f_0(z) (\Delta_{h b h^{-1}} f_0 * \mu'_{b,h})(z) \right| \leq \| \mu'_{b,h} \|_{S(G)} + | \mathbb{E}_{z \in G} \Delta_b f_0(z) |$

(by splitting $\Delta_b f_0$ into constant and mean zero components). We may thus upper
bound the left-hand side of (2.9) by

    $\mathbb{E}_{b,h \in G} \| \mu'_{b,h} \|_{S(G)} + \mathbb{E}_{b \in G} | \mathbb{E}_{z \in G} \Delta_b f_0(z) |.$

The second term is equal to $\Lambda^*_{2,G}(f_0, f_0)$, which by Lemma 1.3 is bounded by $D^{-1/2}$.
As for the first term, we see from (2.4) and the pointwise bound $| \mu'_{b,h}(x) | \leq \mu_{b,h}(x)$
that

    $\| \mu'_{b,h} \|_{S(G)} \leq C_0 D^{-1/2} + \sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y)$

for each $b, h$. The claim follows. □
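To get a feel for the measure $\mu_{b,h}$ of (2.6) and for the "heavy mass" sum appearing in (2.5), one can tabulate both by brute force in a tiny group. The following Python sketch does this for $SL_2(F_3)$; the group, the sample elements `b` and `h`, the value $C_0 = 2$ and the helper names `mu` and `heavy_mass` are all assumptions made for illustration, and the group is far too small for the asymptotic estimates of the paper to be visible.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

p = 3   # hypothetical small prime; G = SL_2(F_3) has 24 elements
G = [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
     if (a * d - b * c) % p == 1]

def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def inv(m):
    """Inverse of a determinant-one matrix [[a, b], [c, d]] is [[d, -b], [-c, a]]."""
    a, b, c, d = m
    return (d % p, -b % p, -c % p, a % p)

def centraliser(b):
    return [c for c in G if mul(c, b) == mul(b, c)]

def mu(b, h):
    """Distribution of g c^-1 h^-1 g^-1 c^-1 h^-1 for uniform g in G, c in Z(b)."""
    Z = centraliser(b)
    weight = Fraction(1, len(G) * len(Z))
    dist = Counter()
    for g in G:
        for c in Z:
            ch = mul(inv(c), inv(h))                 # c^-1 h^-1
            y = mul(mul(mul(g, ch), inv(g)), ch)     # g (c^-1 h^-1) g^-1 (c^-1 h^-1)
            dist[y] += weight
    return dist

def heavy_mass(dist, C0):
    """Sum of mu(y) over those y with mu(y) > C0 / |G|, as in (2.5)."""
    threshold = Fraction(C0, len(G))
    return sum(v for v in dist.values() if v > threshold)

b, h = (0, 1, 2, 0), (1, 1, 0, 1)   # arbitrary sample elements of G
dist = mu(b, h)
print(sum(dist.values()), heavy_mass(dist, 2))
```

The first printed value confirms that $\mu_{b,h}$ is a probability measure; the second is the quantity that Sections 3 and 4 show to be small on average for $SL_d(F)$ with $|F|$ large.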



^5 Indeed, in the model case when $f_3 = 1$ and $G$ is abelian, the right hand side simplifies
to $(\mathbb{E}_{z,b,h \in G} \Delta_b f_0(z) \Delta_b f_0(z h^{-2}))^{1/4}$, which (in the case that $G$ has odd order) is precisely the
Gowers norm $\| f_0 \|_{U^2(G)}$.

^6 Returning to the model case when $f_3 = 1$ and $G$ is an abelian group of odd order, we have
in this case that $\mu_{b,h} = 1/|G|$, and (2.8) is again just the Gowers norm $\| f_0 \|_{U^2(G)}$. The point is
that for certain non-abelian groups $G$, one can still obtain some sort of equidistribution control
on $\mu_{b,h}$ that makes it behave roughly like the uniform distribution $1/|G|$.

3. The case of $SL_2$

We can now establish the $d = 2$ case of Theorem 1.4, which serves as a simplified
model for the general $d$ case. From Proposition 1.2 and Proposition 2.2 it will
suffice to show that

(3.1)    $\mathbb{E}_{b,h \in G} \sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \ll |F|^{-1}$

for some absolute constant $C_0 \geq 1$, where $\mu_{b,h}$ was defined in (2.6).

We now need to understand the distribution of $\mu_{b,h}$. Call an element $b$ of $SL_2(F)$
regular semisimple if its two eigenvalues (in the algebraic closure $\overline{F}$) are distinct, or
equivalently if $\mathrm{trace}(b) \neq \pm 2$. It is easy to see that all but $O(|F|^2)$ elements of $G$ are
regular semisimple. Since $G$ has cardinality comparable to $|F|^3$, and each of the $\mu_{b,h}$
is normalised in $\ell^1$, we thus see that the contribution of the non-regular semisimple
$b$ to (3.1) is $O(|F|^{-1})$, which is acceptable. Thus we may restrict attention to the
regular semisimple $b$.
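The count of non-regular semisimple elements can be confirmed by direct enumeration for small primes. The sketch below (with the hypothetical helper name `non_regular_count`, and restricted to prime fields $F_p$ with $p$ odd for simplicity) counts the matrices in $SL_2(F_p)$ of trace $\pm 2$ and compares with the size of the group.

```python
from itertools import product

def non_regular_count(p):
    """Return (number of elements of SL_2(F_p) with trace +2 or -2, |SL_2(F_p)|)."""
    G = [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
         if (a * d - b * c) % p == 1]
    bad = [m for m in G if (m[0] + m[3]) % p in (2 % p, -2 % p)]
    return len(bad), len(G)

for p in (3, 5, 7):                 # small odd primes, chosen for illustration
    bad, total = non_regular_count(p)
    print(p, bad, total, bad / total)
```

For odd $p$ there are exactly $p^2$ matrices of each of the traces $2$ and $-2$, so the exceptional set has size $2p^2$ out of $p(p^2-1)$, in line with the $O(|F|^2)$ versus $|F|^3$ comparison made above.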

Now we study the quantity $\mu_{b,h}(y)$. It is a classical fact that $|F| \ll |Z(b)| \ll |F|$
(this also follows from the Lang-Weil bound, Proposition A.3). As such, we have

    $\mu_{b,h}(y) \ll |F|^{-4} | \{ (g, c) \in G \times Z(b) : g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1} = y \} |$

which we rewrite as

    $\mu_{b,h}(y) \ll |F|^{-4} | \{ (g, c) \in G \times Z(b) : g c^{-1} h^{-1} g^{-1} = y h c \} |.$

If $c^{-1} h^{-1}$ is central (i.e. equal to $\pm 1$), then $y = 1$, and the contribution to $\mu_{b,h}(1)$
of this case is $O(|F|^{-1})$. Now we consider the contribution of those $c$ for which
$c^{-1} h^{-1}$ is not central. Then the centraliser of $c^{-1} h^{-1}$ has cardinality $O(|F|)$, and so
every element $k$ of $SL_2(F)$ of the same trace as $c^{-1} h^{-1}$ has $O(|F|)$ representations
of the form $g c^{-1} h^{-1} g^{-1}$. Of course, if $k$ does not have the same trace as $c^{-1} h^{-1}$,
it has no such representations. We conclude that

    $\mu_{b,h}(y) \ll |F|^{-1} 1_{y=1} + |F|^{-3} | \{ c \in Z(b) : \mathrm{trace}(yhc) = \mathrm{trace}(c^{-1} h^{-1}) \} |.$

For $a \in SL_2(F)$, we see from direct computation (or the Cayley-Hamilton theorem)
that $\mathrm{trace}(a^{-1}) = \mathrm{trace}(a)$. We thus have $\mu_{b,h}(y) \ll |F|^{-1}$ for $y = 1$, and for $y \neq 1$
we have

    $\mu_{b,h}(y) \ll |F|^{-3} | \{ c \in Z(b) : \mathrm{trace}(yhc) = \mathrm{trace}(hc) \} |.$

The centraliser $Z(b)$ consists of the $F$-points of the algebraic variety $\overline{Z}(b) := \{ c \in SL_2(\overline{F}) :
cb = bc \}$, which is a curve of complexity^7 $O(1)$. From Bezout's theorem, we conclude
that the quantity $| \{ c \in Z(b) : \mathrm{trace}(yhc) = \mathrm{trace}(hc) \} |$ is bounded by $O(1)$ unless
the equation $\mathrm{trace}(yhc) = \mathrm{trace}(hc)$ holds for all $c \in \overline{Z}(b)$, in which case this
quantity is bounded instead by $O(|F|)$. For $C_0$ a sufficiently large absolute constant,
we thus have

    $\sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \ll |F|^{-1} + |F|^{-2} |Y_{b,h}|$

where $Y_{b,h}$ is the set of all $y \in G$ such that $\mathrm{trace}(yhc) = \mathrm{trace}(hc)$ for all $c \in \overline{Z}(b)$.
It will thus suffice to show that

    $|Y_{b,h}| \ll |F|$

whenever $b$ is regular semisimple.



^7 The complexity of an algebraic variety is defined in Definition A.1.

Fix such a $b$. We may find a basis of $\overline{F}^2$ over $\overline{F}$ that makes $b$ diagonal. As $b$ is
also regular semisimple, we conclude that

    $\overline{Z}(b) = \left\{ \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix} : t \in \overline{F} \setminus 0 \right\}$

in this basis, and so the constraint $\mathrm{trace}(yhc) = \mathrm{trace}(hc)$ for all $c \in \overline{Z}(b)$ is
equivalent to the requirement that $yh - h$ vanishes on the diagonal. This constrains
$Y_{b,h}$ to a two-dimensional subspace of the four-dimensional vector space $\mathrm{Mat}_{2 \times 2}(\overline{F})$
of $2 \times 2$ matrices; as $y$ also needs to have determinant 1, we conclude that $Y_{b,h}$ is
constrained to a complexity $O(1)$ curve in this plane. By the Schwarz-Zippel lemma
(see Lemma A.2), we conclude that $|Y_{b,h}| \ll |F|$ as required.

4. The case of $SL_d$

Now we turn to the general case of Theorem 1.4. This will basically be a reprise
of the arguments in the preceding section, but with a heavier reliance on algebraic
geometry in place of ad hoc computations.

We allow all implied constants to depend on $d$. As before, by Proposition 1.2
and Proposition 2.2 it suffices to establish the bound (3.1). We may assume that
$|F|$ is sufficiently large depending on $d$, as the claim is trivial otherwise.

Again, call $b \in SL_d(F)$ regular semisimple if it is diagonalisable in $\overline{F}$ with
distinct eigenvalues. A well-known computation gives

    $|GL_d(F)| = \prod_{i=0}^{d-1} (|F|^d - |F|^i) = (1 + O(|F|^{-1})) |F|^{d^2};$

since $|G| = |GL_d(F)| / |F^\times|$, we conclude in particular that

(4.1)    $|F|^{d^2 - 1} \ll |G| \ll |F|^{d^2 - 1}$

(this also follows from the Lang-Weil estimate, Proposition A.3). If $b$ is not regular
semisimple, then its characteristic polynomial has a repeated root. This constrains
$b$ to an algebraic hypersurface of $SL_d(\overline{F})$ of complexity $O(1)$. This hypersurface
has dimension $d^2 - 2$, so by the Schwarz-Zippel lemma (see Lemma A.2), we have
that at most $O(|F|^{d^2 - 2})$ elements of $G$ are not regular semisimple. This is only
$O(|F|^{-1})$ of the elements of $G$, so to prove (3.1) it suffices as before to consider the
contribution of the regular semisimple $b$.
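The cardinality formula for $GL_d(F)$ and the resulting size of $G = SL_d(F)$ used above can be verified by exhaustive enumeration in very small cases. The following Python sketch does so over prime fields; the helper names `det`, `count_by_det` and `gl_formula` and the choice of test cases are assumptions of the sketch.

```python
from itertools import product

def det(m):
    """Determinant of a square matrix (sequence of row tuples) by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def count_by_det(d, p):
    """Brute-force sizes (|GL_d(F_p)|, |SL_d(F_p)|)."""
    gl = sl = 0
    for entries in product(range(p), repeat=d * d):
        m = [entries[i * d:(i + 1) * d] for i in range(d)]
        v = det(m) % p
        gl += v != 0
        sl += v == 1
    return gl, sl

def gl_formula(d, q):
    """The product of (q^d - q^i) over i = 0, ..., d-1."""
    out = 1
    for i in range(d):
        out *= q ** d - q ** i
    return out

for d, p in ((2, 3), (2, 5), (3, 2)):     # small cases, chosen for illustration
    gl, sl = count_by_det(d, p)
    print(d, p, gl, gl_formula(d, p), sl, gl_formula(d, p) // (p - 1))
```

In each case the enumerated count of invertible matrices agrees with the product formula, and the determinant-one count is smaller by the factor $|F^\times| = p - 1$.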

If $b$ is regular semisimple, then the centraliser $Z(b)$ of $b$ consists of the $F$-points
of a $d-1$-dimensional torus $\overline{Z}(b)$ in $SL_d(\overline{F})$, of complexity $O(1)$, defined over $F$.
By the Lang-Weil bound (Proposition A.3), we have $|F|^{d-1} \ll |Z(b)| \ll |F|^{d-1}$.
Arguing as in the previous section, we thus have

(4.2)    $\mu_{b,h}(y) \ll |F|^{-d^2 - d + 2} | \{ (g, c) \in G \times Z(b) : g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1} = y \} |.$

Let $\phi_{b,h} : SL_d(\overline{F}) \times \overline{Z}(b) \to SL_d(\overline{F})$ be the map

(4.3)    $\phi_{b,h}(g, c) := g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1}.$

This is a regular map of complexity $O(1)$ from the $d^2 + d - 2$-dimensional irreducible
variety $SL_d(\overline{F}) \times \overline{Z}(b)$ to the $d^2 - 1$-dimensional variety $SL_d(\overline{F})$.

Suppose that $(b, h)$ is such that the map $\phi_{b,h}$ is dominant. Applying Proposition
A.5, we see that there exists a subset $\Sigma$ of $SL_d(\overline{F}) \times \overline{Z}(b)$ which can be covered by
$O(1)$ varieties of complexity $O(1)$ and dimension at most $d^2 + d - 3$, such that for
each $y \in SL_d(\overline{F})$, the set

    $\{ (g, c) \in (SL_d(\overline{F}) \times \overline{Z}(b)) \setminus \Sigma : \phi_{b,h}(g, c) = y \}$

is covered by $O(1)$ varieties of complexity $O(1)$ and dimension at most $d - 1$.
Applying the Schwarz-Zippel bound (Lemma A.2), we conclude that

    $| \{ (g, c) \in (G \times Z(b)) \setminus \Sigma : \phi_{b,h}(g, c) = y \} | \ll |F|^{d-1}$

for all $y \in G$, and thus by (4.2) one has

    $\mu_{b,h}(y) \ll |F|^{-d^2 + 1} + |F|^{-d^2 - d + 2} | \{ (g, c) \in (G \times Z(b)) \cap \Sigma : g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1} = y \} |.$

By (4.1), we conclude (for $C_0$ large enough) that

    $\sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \ll |F|^{-d^2 - d + 2} \sum_{y \in G} | \{ (g, c) \in (G \times Z(b)) \cap \Sigma : g c^{-1} h^{-1} g^{-1} c^{-1} h^{-1} = y \} |$
    $= |F|^{-d^2 - d + 2} | (G \times Z(b)) \cap \Sigma |$

and hence by another application of Schwarz-Zippel, we have

    $\sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \ll |F|^{-1}$

when $\phi_{b,h}$ is dominant. On the other hand, when $\phi_{b,h}$ is not dominant, we may
crudely bound

    $\sum_{y \in G : \mu_{b,h}(y) > C_0/|G|} \mu_{b,h}(y) \leq \sum_{y \in G} \mu_{b,h}(y) = 1.$

To establish (3.1), it thus suffices to show that there are at most $O(|F|^{-1} |G|^2)$ pairs
$(b, h) \in G \times G$ with $b$ regular semisimple and $\phi_{b,h}$ not dominant.

Fix $b$ regular semisimple. It suffices to show that $\phi_{b,h}$ is dominant for all but
at most $O(|F|^{-1} |G|)$ values of $h \in G$; by the Schwarz-Zippel bound (Lemma A.2),
it suffices to show that $\phi_{b,h}$ is dominant for all $h \in SL_d(\overline{F})$ outside of $O(1)$ algebraic
varieties of positive codimension and complexity $O(1)$. As this assertion only
involves $\overline{F}$ and not $F$, we may now diagonalise $b$ over $\overline{F}$, and work in a basis in
which $b$ is diagonal (with coefficients in $\overline{F}$ rather than in $F$). The torus $\overline{Z}(b)$ is
now the group $T(\overline{F})$ of diagonal matrices in $SL_d(\overline{F})$. It now suffices to establish
the following claim:

Proposition 4.1 (Quantitative generic non-degeneracy). Let $k$ be an algebraically
closed field, and let $d \geq 1$; we allow all implied constants to depend on $d$. Then
for all $h \in SL_d(k)$ outside of $O(1)$ algebraic varieties of positive codimension and
complexity $O(1)$, the map $\psi_h : SL_d(k) \times T(k) \to SL_d(k)$ defined by

(4.4)    $\psi_h(g, c) := g c^{-1} h^{-1} g^{-1} c^{-1} h$

is dominant, where $T(k)$ denotes the group of diagonal matrices in $SL_d(k)$.

Indeed, by setting $k$ equal to the algebraic closure $\overline{F}$ of $F$, and noting that
$\phi_{b,h} = \psi_h h^{-2}$, the claim follows. (We have shifted $\psi_h$ in order to map the identity
$(1, 1)$ to the identity 1.)

It turns out that by using an ultraproduct argument, one can show that Proposition
4.1 is implied by the following, seemingly weaker, qualitative variant of that
proposition, in which the uniform bounds on the exceptional set are dropped:

Proposition 4.2 (Qualitative generic non-degeneracy). Let $k$ be an algebraically
closed field, and let $d \geq 1$. Then for generic $h \in SL_d(k)$ (that is, for all $h$ outside of
a finite union of varieties of positive codimension), the map $\psi_h : SL_d(k) \times T(k) \to
SL_d(k)$ defined by (4.4) is dominant.

Indeed, if Proposition 4.1 failed, then one could find $d \geq 1$ and a sequence $k_n$ of
algebraically closed fields such that the set of $h \in SL_d(k_n)$ for which $\psi_h$ fails to be
dominant cannot be covered by $n$ algebraic varieties of positive codimension and
complexity at most $n$. Performing an ultraproduct with respect to a non-principal
ultrafilter on the natural numbers (see [7, Appendix A]), we then obtain a new
(and much larger) algebraically closed field $k$, with the property that the set of
$h \in SL_d(k)$ for which $\psi_h$ fails to be dominant cannot be covered by any finite
number of algebraic varieties of positive codimension, contradicting Proposition
4.2. (Here we use the continuity of irreducibility and dominance with respect to
ultraproducts; see [7, Lemma A.2] and [7, Lemma A.7].)

It remains to prove Proposition 4.2. By the irreducibility of $\mathrm{SL}_d(k)$, it suffices to show that the derivative map

\[ D\phi_h(1,1) : \mathfrak{sl}_d(k) \times \mathfrak{t}(k) \to \mathfrak{sl}_d(k) \]

is full rank for generic $h \in \mathrm{SL}_d(k)$, where $\mathfrak{sl}_d(k)$ is the vector space of trace zero $d \times d$ matrices over $k$, and $\mathfrak{t}(k)$ is the subspace of $\mathfrak{sl}_d(k)$ consisting of diagonal matrices over $k$ of trace zero. From the product rule and (4.4), we may evaluate $D\phi_h(1,1)$ explicitly as

\[ D\phi_h(1,1)(X,Y) = X - h^{-1}Xh - Y - h^{-1}Yh \]

for $X \in \mathfrak{sl}_d(k)$ and $Y \in \mathfrak{t}(k)$.

We may restrict attention to $h$ which are regular semisimple (or equivalently, those $h$ whose characteristic polynomial has no repeated roots), as the complement of this set is certainly contained in a finite number of algebraic varieties of positive codimension. We may thus diagonalise $h = ADA^{-1}$ for some $A \in \mathrm{SL}_d(k)$ and diagonal $D$ with distinct diagonal entries. Then we have

\[ D\phi_h(1,1)(X,Y) = A(X' - D^{-1}X'D - Y' - D^{-1}Y'D)A^{-1} \]

where $X' := A^{-1}XA$ and $Y' := A^{-1}YA$. We thus see that $D\phi_h(1,1)$ is full rank if and only if the map

\[ (X',Y') \mapsto X' - D^{-1}X'D - Y' - D^{-1}Y'D \]

is a full rank map from $\mathfrak{sl}_d(k) \times A^{-1}\mathfrak{t}(k)A$ to $\mathfrak{sl}_d(k)$. It thus suffices to show that this map is full rank for generic $A \in \mathrm{SL}_d(k)$ and $D \in T(k)$.

As $D$ is a diagonal matrix with distinct diagonal entries, we see that the image of $\mathfrak{sl}_d(k)$ under the map $X' \mapsto X' - D^{-1}X'D$ is the space of all matrices that vanish on the diagonal. To show that $D\phi_h(1,1)$ has full rank, it thus suffices to show that the map $Y' \mapsto \mathrm{diag}(Y' + D^{-1}Y'D)$ has full rank from $A^{-1}\mathfrak{t}(k)A$ to $\mathfrak{t}(k)$. Since $\mathrm{diag}(Y' + D^{-1}Y'D) = 2\,\mathrm{diag}(Y')$, it suffices to show that the diagonal map $Y' \mapsto \mathrm{diag}(Y')$ has full rank from $A^{-1}\mathfrak{t}(k)A$ to $\mathfrak{t}(k)$ for generic $A \in \mathrm{SL}_d(k)$. As this is clearly a Zariski-open algebraic constraint, and contains the case $A = 1$, we conclude that one has full rank for generic $A$, and the claim follows.
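The two facts about conjugation by a diagonal matrix used in the last step can be checked on a small example. The following is only an illustrative sketch over the rationals: the sample matrix `M`, the diagonal entries `d` and all helper names are assumptions chosen for the illustration, not objects from the text.

```python
from fractions import Fraction

def conjugate_by_diagonal(M, d):
    """Return D^{-1} M D for D = diag(d): entry (i, j) becomes M[i][j] * d[j] / d[i]."""
    n = len(d)
    return [[Fraction(M[i][j]) * d[j] / d[i] for j in range(n)] for i in range(n)]

def diagonal(M):
    """Diagonal entries of a square matrix."""
    return [M[i][i] for i in range(len(M))]

def combine(A, B, sign):
    """Entrywise A + sign * B."""
    n = len(A)
    return [[A[i][j] + sign * B[i][j] for j in range(n)] for i in range(n)]

# Hypothetical sample data: D has distinct diagonal entries and determinant 1,
# and M is an arbitrary trace-zero 3 x 3 matrix with non-zero off-diagonal entries.
d = [Fraction(2), Fraction(3), Fraction(1, 6)]
M = [[1, 4, -2], [5, 2, 7], [3, -1, -3]]

conj = conjugate_by_diagonal(M, d)
difference = combine(M, conj, -1)  # M - D^{-1} M D
total = combine(M, conj, +1)       # M + D^{-1} M D
```

Here `difference` vanishes on the diagonal while its off-diagonal entries are rescaled by the non-zero factors $1 - d_j/d_i$, and the diagonal of `total` is twice the diagonal of `M`.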






TERENCE TAO 



5. Expansion 

In the remarkable paper of Bourgain and Gamburd [6], the quasirandomness
properties of SL2{F), combined with the product theory in such groups (see [H]), 
were used to establish spectral gaps for the generators of various Cayley graphs. In our notation, the results of [6] established spectral gap results, a typical one of which is the assertion that with probability $1 - o_{p \to \infty}(1)$, one has

\[ \left\| \tfrac{1}{4}(\delta_a + \delta_b + \delta_{a^{-1}} + \delta_{b^{-1}}) \right\|_{S(\mathrm{SL}_2(F_p))} \leq 1 - c \]

for some absolute constant $c > 0$, where $F_p$ is a finite field of prime order and $a, b$ are chosen uniformly at random from $\mathrm{SL}_2(F_p)$. This result has since been generalised in a number of different directions; see [17] for a survey.

In this section, we establish some related expansion results, but instead of a probability measure (such as $\tfrac{1}{4}(\delta_a + \delta_b + \delta_{a^{-1}} + \delta_{b^{-1}})$) supported on a small number of points, we will establish spectral bounds on (quasi-)probability measures distributed more or less uniformly on subvarieties $V$ of $\mathrm{SL}_d$; this will play an important role in the proof of Theorem 1.5 in later sections. The main result is that as long as $V$ is not "trapped" in an algebraic subgroup of $\mathrm{SL}_d$ (or a coset thereof), there is a spectral norm bound which gains a power of $|F|$ over the trivial bound. The arguments are very much in the spirit of Bourgain and Gamburd [6], with the main ingredients being "escape from subvarieties", quasirandomness, and some basic algebraic geometry. However, due to the algebraic structure of the measures being studied, combinatorial tools such as the product theorem of Helfgott [13] are not required in this argument (though they could certainly be deployed in order to prove more general results, in which the measure in question is not assumed to be adapted to an algebraic subvariety).

More precisely, we will establish the following result. 

Proposition 5.1 (Expansion from subvarieties). Let $k$ be an algebraically closed field, and let $F$ be a finite subfield of $k$. Let $V \subset \mathrm{SL}_d(k)$ be an irreducible algebraic variety defined over $k$ of complexity at most $M$. Suppose that $V$ is not contained in any coset $Hg$ of a proper algebraic subgroup $H$ of $\mathrm{SL}_d(k)$. Then one has

\[ \|\mu\|_{S(\mathrm{SL}_d(F))} \ll_{d,M} |F|^{\dim(V) - c} \|\mu\|_{L^\infty(V \cap \mathrm{SL}_d(F))} \]

for all $\mu : \mathrm{SL}_d(F) \to \mathbf{C}$ supported on $V \cap \mathrm{SL}_d(F)$, where $c > 0$ depends only on $d$.

Proof. We perform a downward induction on $\dim(V)$, which is an integer between $0$ and $\dim(\mathrm{SL}_d) = d^2 - 1$. When $\dim(V) = \dim(\mathrm{SL}_d)$, the claim follows from ((Ol), (4.1), and Proposition 1.2. Now suppose that $\dim(V) < \dim(\mathrm{SL}_d)$, and that the claim has already been proven for all larger values of $\dim(V)$.

We normalise $\|\mu\|_{L^\infty(V \cap \mathrm{SL}_d(F))} := |F|^{-\dim(V)}$ and allow all implied constants to depend on $d$ and $M$, so our task is now to show that

\[ \|\mu\|_{S(\mathrm{SL}_d(F))} \ll |F|^{-c}. \]

Recall the $TT^*$ identity

\[ \|TT^*\|_{\mathrm{op}} = \|T\|_{\mathrm{op}}^2 \]

whenever $T$ is a bounded linear operator between Hilbert spaces. Applying this to the convolution operator $f \mapsto f * \mu$ on the Hilbert space of mean zero functions in $L^2(G)$, we conclude that

\[ \|\mu * \tilde\mu\|_{S(\mathrm{SL}_d(F))} = \|\mu\|_{S(\mathrm{SL}_d(F))}^2 \]

where $\tilde\mu : G \to \mathbf{C}$ is the function $\tilde\mu(g) := \overline{\mu(g^{-1})}$. It will thus suffice to show that

\[ \|\mu * \tilde\mu\|_{S(\mathrm{SL}_d(F))} \ll |F|^{-c} \]

for some $c > 0$ depending only on $m, d$. (Note that as there are only $O(1)$ different values of $\dim(V)$, we may allow the value of the constant $c$ to change with each step of the induction.)

We consider the product map $\phi : V \times V \to \mathrm{SL}_d(k)$ given by $\phi(v,w) := vw^{-1}$, and let $W$ be the Zariski closure of $\phi(V \times V)$. As $V \times V$ is irreducible, $W$ is also irreducible. As $W$ contains a translate of $V$, we have $\dim(W) \geq \dim(V)$. We claim that we in fact have strict inequality $\dim(W) > \dim(V)$. To see this, suppose for contradiction that $\dim(W) = \dim(V)$. Then for each $w \in V$, $Vw^{-1}$ is contained in the irreducible variety $W$, and has the same dimension as $W$, and so $Vw^{-1} = W$ for all $w \in V$. This implies that $W W^{-1} = \phi(V \times V) \subset W$, or in other words that $W$ forms a group, and is thus a proper algebraic subgroup of $\mathrm{SL}_d(k)$. But $V$ is contained in a coset of $W$, contradicting the hypothesis on $V$. Thus we have $\dim(W) > \dim(V)$.

We now apply Proposition A.5 to conclude that $W$ has complexity $O(1)$, and that there is a subset $\Sigma$ of $V \times V$ covered by $O(1)$ varieties of complexity $O(1)$ and dimension strictly less than $2\dim(V)$, such that for each $w \in W$, the set $\{(v,v') \in (V \times V) \setminus \Sigma : \phi(v,v') = w\}$ is contained in $O(1)$ varieties of complexity $O(1)$ and dimension at most $2\dim(V) - \dim(W)$. Applying the Schwarz-Zippel bound (Lemma A.2), we conclude that

\[ |\Sigma \cap (G \times G)| \ll |F|^{2\dim(V) - 1} \tag{5.1} \]

and

\[ |\{(v,v') \in ((V \times V) \cap (G \times G)) \setminus \Sigma : \phi(v,v') = w\}| \ll |F|^{2\dim(V) - \dim(W)}. \tag{5.2} \]
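Next, we expand

\[ \mu * \tilde\mu(w) = \sum_{(v,v') \in (V \times V) \cap (G \times G) : \phi(v,v') = w} \mu(v)\overline{\mu(v')} \]

and then decompose

\[ \mu * \tilde\mu = \mu_1 + \mu_2 \]

where

\[ \mu_1(w) := \sum_{(v,v') \in \Sigma \cap (G \times G) : \phi(v,v') = w} \mu(v)\overline{\mu(v')} \]

and

\[ \mu_2(w) := \sum_{(v,v') \in ((V \times V) \cap (G \times G)) \setminus \Sigma : \phi(v,v') = w} \mu(v)\overline{\mu(v')}. \]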

As $\|\mu\|_{L^\infty(V)} = |F|^{-\dim(V)}$, we see that

\[ \|\mu_1\|_{\ell^1(G)} \leq \sum_{(v,v') \in \Sigma \cap (G \times G)} |F|^{-\dim(V)} |F|^{-\dim(V)} \ll |F|^{-1} \tag{5.3} \]

thanks to (5.1). By (2.2), we thus have

\[ \|\mu_1\|_{S(G)} \ll |F|^{-1}. \]

Next, from (5.2) and the normalisation $\|\mu\|_{L^\infty(V)} = |F|^{-\dim(V)}$ we have

\[ \mu_2(w) \ll |F|^{2\dim(V) - \dim(W)} |F|^{-\dim(V)} |F|^{-\dim(V)} = |F|^{-\dim(W)} \]

for all $w \in G$. As $\mu_2$ is supported on $W$, we conclude from the induction hypothesis that

\[ \|\mu_2\|_{S(G)} \ll |F|^{-c} \]

for some $c > 0$ depending only on $d$, and the claim follows. (Note that as $W$ contains a translate of $V$, it cannot itself be contained in a coset of a proper algebraic subgroup of $G$.) □

We remark that the above proof in fact allows one to take c := 2^^ ' ' 
We will apply Proposition 5.1 in the case of a function $\mu$ supported on a conjugacy class:

Corollary 5.2. Let $F$ be a finite field, let $d \geq 2$, and let $a \in \mathrm{SL}_d(F)$ be non-central (i.e. $a$ is not a multiple of the identity). Let $C(a) := \{gag^{-1} : g \in \mathrm{SL}_d(F)\}$ be the conjugacy class of $a$. Then

\[ \|1_{C(a)}\|_{S(\mathrm{SL}_d(F))} \ll |F|^{-c} |C(a)| \]

for some $c > 0$ depending only on $d$.

Proof. We allow all implied constants to depend on $d$. We apply Proposition 5.1 with $k$ equal to the algebraic closure of $F$, and $V$ equal to the conjugacy class $C(a) := \{gag^{-1} : g \in \mathrm{SL}_d(k)\}$. It is clear that $V$ is an irreducible algebraic variety defined over $k$ of complexity $O(1)$. Proposition 5.1 will give the desired claim unless $C(a)$ is contained in a coset $Hg$ of a proper algebraic subgroup $H$ of $\mathrm{SL}_d(k)$. But this implies that $H$ contains $C(a) \cdot C(a)^{-1}$, which implies that the group $N$ generated by $C(a) \cdot C(a)^{-1}$ is a proper subgroup of $\mathrm{SL}_d(k)$. But this group is conjugation-invariant and thus normal. It is a classical fact that the algebraic group $\mathrm{SL}_d(k)$ is almost simple, in the sense that the only proper normal subgroups are finite (in fact, the maximal normal subgroup is the center, or equivalently the quotient $\mathrm{PSL}_d(k)$ is simple). This implies that $C(a)$ is finite. But this contradicts the hypothesis that $a$ is not central, and the claim follows. □

Remark 5.3. A standard application of Schur's lemma gives the identity

\[ \sum_{b \in C(a)} \rho(b) = |C(a)| \frac{\mathrm{trace}\,\rho(a)}{\dim(V)} 1_V \]

for any non-trivial irreducible unitary representation $\rho : \mathrm{SL}_d(F) \to U(V)$, where $1_V$ denotes the identity operator on $V$. From this and Remark 2.1 we see that Corollary 5.2 is equivalent to the assertion that $|\mathrm{trace}\,\rho(a)| \ll |F|^{-c} \dim(V)$ for any non-trivial irreducible representation $\rho : \mathrm{SL}_d(F) \to U(V)$ and any non-central $a$. It is likely that this result could also be established directly (with an optimal value of $c$) from the representation theory of $\mathrm{SL}_d(F)$, but we will not do so here.
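The Schur's lemma identity in Remark 5.3 holds for any finite group, so it can be sanity-checked on a much smaller example than $\mathrm{SL}_d(F)$. The sketch below uses the two-dimensional irreducible representation of the symmetric group on three letters, realised by integer matrices; this choice of group and all names in the code are assumptions made purely for illustration.

```python
from fractions import Fraction

IDENTITY = ((1, 0), (0, 1))
R = ((0, -1), (1, -1))  # element of order 3
S = ((0, 1), (1, 0))    # element of order 2

def matmul(a, b):
    """Product of two 2 x 2 integer matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def generate(generators):
    """Close the identity under right multiplication by the generators."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = matmul(g, h)
            if gh not in elements:
                elements.add(gh)
                frontier.append(gh)
    return elements

def inverse(g, group):
    return next(h for h in group if matmul(g, h) == IDENTITY)

def conjugacy_class(a, group):
    return {matmul(matmul(g, a), inverse(g, group)) for g in group}

def class_sum(a, group):
    """Sum of the representing matrices over the conjugacy class of a."""
    cls = conjugacy_class(a, group)
    return [[sum(b[i][j] for b in cls) for j in range(2)] for i in range(2)]

def schur_prediction(a, group):
    """|C(a)| * trace(a) / dim(V) times the identity, with dim(V) = 2."""
    cls = conjugacy_class(a, group)
    scalar = Fraction(len(cls) * (a[0][0] + a[1][1]), 2)
    return [[scalar, 0], [0, scalar]]

GROUP = generate([R, S])
```

For both non-trivial conjugacy classes, `class_sum` agrees with `schur_prediction`: the class of `R` sums to minus the identity, and the class of `S` (trace zero) sums to the zero matrix.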

6. A reduction to a Borel group

We will abbreviate $o_{|F| \to \infty}()$ as $o()$ throughout the rest of this paper. We now begin the proof of Theorem 1.5 by making some reductions. The first is to use the Cauchy-Schwarz inequality to reduce Theorem 1.5 to a seemingly weaker statement in which the absolute values have been moved outside of the $g$ averaging. In other words, we will deduce Theorem 1.5 from the following statement:



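Theorem 6.1. Let $F$ be a finite field, and set $G := \mathrm{SL}_2(F)$. Let $S$ denote the set of all hyperbolic elements of $\mathrm{SL}_2(F)$. Then for any functions $f_0, f_1, f_2, f_3 : G \to \mathbf{C}$, we have

\[ \left| \mathbf{E}_{g \in S} \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) - \prod_{i=0}^{3} \mathbf{E}_G f_i \right| = o\left( \prod_{i=0}^{3} \|f_i\|_{L^\infty(G)} \right). \]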

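Let us assume Theorem 6.1 for now and see how it implies Theorem 1.5. If $f_3$ is constant, then the claim follows from Theorem 1.4, so we may assume without loss of generality that $f_3$ has mean zero. We may take the $f_i$ to be real-valued, and also normalise $\|f_i\|_{L^\infty(G)} = 1$ for each $i$. Our task is now to show that

\[ \mathbf{E}_{g \in S} \left| \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) \right| = o(1). \]

By Cauchy-Schwarz, it suffices to show that

\[ \mathbf{E}_{g \in S} \left| \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) \right|^2 = o(1), \]

which we expand as

\[ \mathbf{E}_{g \in S} \mathbf{E}_{x,y \in G} \prod_{i=0}^{3} f_i(xg^i) f_i(yg^i) = o(1). \]

Substituting $y = hx$, we can rewrite the left-hand side as

\[ \mathbf{E}_{h \in G} \mathbf{E}_{g \in S} \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) f_i(hxg^i). \]

Applying Theorem 6.1 we have

\[ \mathbf{E}_{g \in S} \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) f_i(hxg^i) = \prod_{i=0}^{3} \mathbf{E}_{x \in G} f_i(x) f_i(hx) + o(1) \]

for each $h \in G$, so it suffices to show that

\[ \left| \mathbf{E}_{h \in G} \prod_{i=0}^{3} \mathbf{E}_{x \in G} f_i(x) f_i(hx) \right| = o(1). \]

We can bound the left-hand side in magnitude by

\[ \mathbf{E}_{h \in G} \left| \mathbf{E}_{x \in G} f_3(x) f_3(hx) \right| \]

and the claim now follows from Lemma 1.3 (applied to the reversed function $x \mapsto f_3(x^{-1})$).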

It remains to establish Theorem 6.1. We will deduce it from the following variant theorem on the standard Borel subgroup $B$ of $\mathrm{SL}_2(F)$.

Theorem 6.2. Let $F$ be a finite field, and $B$ be the subgroup of matrices in $\mathrm{SL}_2(F)$ which are upper-triangular. Let $U$ be the normal subgroup of $B$ consisting of matrices which are equal to the identity matrix except possibly at the upper right entry. Let $f_0, \ldots, f_3 : B \to \mathbf{C}$. Then

\[ \Lambda_{4,B}(f_0, \ldots, f_3) = \Lambda_{4,B}(f_0 * \mu_U, \ldots, f_3 * \mu_U) + o(\|f_0\|_{L^\infty(B)} \cdots \|f_3\|_{L^\infty(B)}) \]

where $\mu_U := \frac{1}{|U|} 1_U$.

Let us assume Theorem 6.2 for now, and show how it implies Theorem 6.1. We may again assume that $f_3$ has mean zero, and that the $f_i$ are real-valued with $\|f_i\|_{L^\infty(G)} = 1$ for each $i$. Our task is to show that

\[ \left| \mathbf{E}_{g \in S} \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xg^i) \right| = o(1). \]

The first task is to replace the set $S$ by the set $B$ as follows. Observe that $B$ is the space of all matrices in $\mathrm{SL}_2(F)$ that fix the span $\mathrm{span}(e_2)$ of the second vector $e_2$ of the standard basis $e_1, e_2$ of $F^2$. Any conjugate $gBg^{-1}$ of $B$, where $g \in \mathrm{SL}_2(F)$, would fix another line; this new line would be identical to the original line $\mathrm{span}(e_2)$ precisely when $g \in B$, so the total number of such conjugates is

\[ |\mathrm{SL}_2(F)|/|B| = (1 + O(|F|^{-1}))|F|. \]

If $s \in S$ is regular semisimple, then it has two distinct one-dimensional eigenspaces in $F^2$, and thus preserves exactly two distinct lines. As such, it lies in $gBg^{-1}$ for $2|B|$ different values of $g$. We thus see that the number of regular semisimple elements of $S$ is equal to $\frac{|G|}{2|B|}$ times the number of regular semisimple elements of $B$. An element of $B$ is regular semisimple if and only if its diagonal entries are distinct, so we see that the proportion of elements of $B$ that are regular semisimple is $1 - O(|F|^{-1})$. We conclude that there are $(\frac{1}{2} + O(|F|^{-1}))|G|$ regular semisimple elements of $S$. As all but $O(|F|^{-1}|G|)$ elements of $G$ (and hence of $S$) are regular semisimple, we thus see that

\[ \mathbf{E}_{g \in S} f(g) = \mathbf{E}_{g \in G} \mathbf{E}_{h \in gBg^{-1}} f(h) + O(|F|^{-1}) \]

for any function $f : G \to \mathbf{C}$ of magnitude $O(1)$. It will thus suffice to show that

\[ \mathbf{E}_{g \in G} \mathbf{E}_{h \in gBg^{-1}} \mathbf{E}_{x \in G} \prod_{i=0}^{3} f_i(xh^i) = o(1). \]
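The double-counting relation between regular semisimple elements of $S$ and of $B$ can be confirmed by brute force for a small prime field. The sketch below assumes $F = \mathbf{F}_p$ with $p$ an odd prime, and reads "regular semisimple element of $S$" as "element with two distinct eigenvalues in $F$"; this reading, and all function names, are assumptions of the illustration.

```python
from itertools import product

def sl2(p):
    """All matrices (a, b, c, d) over F_p with ad - bc = 1."""
    return [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
            if (a * d - b * c) % p == 1]

def has_two_distinct_eigenvalues_in_field(m, p):
    """True when x^2 - (a + d) x + 1 has two distinct roots in F_p (p odd)."""
    a, b, c, d = m
    disc = ((a + d) ** 2 - 4) % p
    return disc != 0 and any((y * y) % p == disc for y in range(1, p))

def counts(p):
    """Return (|G|, |B|, #split regular semisimple in G, same count in B)."""
    G = sl2(p)
    B = [m for m in G if m[2] == 0]  # upper-triangular matrices
    split_in_G = sum(1 for m in G if has_two_distinct_eigenvalues_in_field(m, p))
    split_in_B = sum(1 for (a, b, c, d) in B if a != d)  # distinct diagonal entries
    return len(G), len(B), split_in_G, split_in_B
```

For `p = 7` this gives 336 elements in the group, 42 in the Borel subgroup, and the identity `2 * |B| * split_in_G == |G| * split_in_B` holds exactly, matching the factor $|G|/(2|B|)$ in the text.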

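Fix $g \in G$. By foliating $G$ into left cosets $agBg^{-1}$ of $gBg^{-1}$, and applying Theorem 6.2 (conjugated by $g$) to each coset, we see that

\[ \mathbf{E}_{h \in gBg^{-1}} \mathbf{E}_{x \in agBg^{-1}} \prod_{i=0}^{3} f_i(xh^i) = \mathbf{E}_{h \in gBg^{-1}} \mathbf{E}_{x \in agBg^{-1}} \prod_{i=0}^{3} (f_i * \mu_{gUg^{-1}})(xh^i) + o(1) \]

for each $a$. It thus suffices to show that

\[ \mathbf{E}_{g \in G} \mathbf{E}_{h \in gBg^{-1}} \mathbf{E}_{x \in G} \prod_{i=0}^{3} (f_i * \mu_{gUg^{-1}})(xh^i) = o(1). \]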

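Applying the crude bound

\[ \left| \mathbf{E}_{h \in gBg^{-1}} \mathbf{E}_{x \in G} \prod_{i=0}^{3} (f_i * \mu_{gUg^{-1}})(xh^i) \right| \leq \mathbf{E}_{x \in G} |f_3 * \mu_{gUg^{-1}}(x)| \]

it suffices to show that

\[ \mathbf{E}_{g \in G} \mathbf{E}_{x \in G} |f_3 * \mu_{gUg^{-1}}(x)| = o(1). \]

By Cauchy-Schwarz, it suffices to show that

\[ \mathbf{E}_{g \in G} \mathbf{E}_{x \in G} |f_3 * \mu_{gUg^{-1}}(x)|^2 = o(1). \]

From the identity

\[ \mathbf{E}_{x \in G} |f_3 * \mu_{gUg^{-1}}(x)|^2 = \mathbf{E}_{x \in G} f_3(x) (f_3 * \mu_{gUg^{-1}})(x) \]

it suffices to show that

\[ |\mathbf{E}_{g \in G} \mathbf{E}_{x \in G} f_3(x) (f_3 * \mu_{gUg^{-1}})(x)| = o(1). \]

By definition of the reduced spectral norm, the left-hand side is bounded by

\[ \|\mathbf{E}_{g \in G} \mu_{gUg^{-1}}\|_S. \]

Observe that

\[ \mathbf{E}_{g \in G} \mu_{gUg^{-1}} = \mathbf{E}_{u \in U} \mathbf{E}_{g \in G} \delta_{gug^{-1}} = \mathbf{E}_{u \in U} \frac{1}{|C(u)|} 1_{C(u)} \]

and so by Minkowski's inequality

\[ \|\mathbf{E}_{g \in G} \mu_{gUg^{-1}}\|_S \leq \mathbf{E}_{u \in U} \frac{1}{|C(u)|} \|1_{C(u)}\|_S. \]

By Corollary 5.2, we may bound $\frac{1}{|C(u)|} \|1_{C(u)}\|_S$ by $O(|F|^{-c})$ for some $c > 0$ depending only on $d$, except when $u$ is the identity element, in which case we have the trivial bound of $1$. As $U$ has cardinality $|F|$, we obtain a net bound of $O(|F|^{-c} + |F|^{-1})$, and the claim follows.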

It remains to establish Theorem 6.2. This is the purpose of the remaining sections of the paper.

7. Progressions in a Borel group 



We now prove Theorem 6.2. By splitting each function $f_i$ into functions constant along cosets of $U$, or having mean zero along cosets of $U$, we see that it suffices to show that

\[ \Lambda_{4,B}(f_0, \ldots, f_3) = o(\|f_0\|_{L^\infty(B)} \cdots \|f_3\|_{L^\infty(B)}) \]

whenever at least one of $f_0, f_1, f_2, f_3$ has mean zero along cosets of $U$. By the symmetry

\[ \Lambda_{4,B}(f_0, \ldots, f_3) = \Lambda_{4,B}(f_3, \ldots, f_0) \]

we may assume that $f_{i_0}$ has mean zero along cosets of $U$ for some $i_0 \in \{2, 3\}$. We may also take $f_0, f_1, f_2, f_3$ to be real-valued with $L^\infty(B)$ norm of $1$, so our task is to show that

\[ \mathbf{E}_{x,g \in B} f_0(x) f_1(xg) f_2(xg^2) f_3(xg^3) = o(1). \]
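We will take advantage of the short exact sequence

\[ 0 \to F \to B \to F^\times \to 0 \]

between the additive group $F = (F, +)$, the Borel group $B$, and the multiplicative group $F^\times := (F \setminus \{0\}, \cdot)$, given by the inclusion map $\iota : F \to B$ and the projection map $\pi : B \to F^\times$ defined by the formulae

\[ \iota(a) := \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \]

and

\[ \pi \begin{pmatrix} t & a \\ 0 & t^{-1} \end{pmatrix} := t. \]

For any $a, b \in F$, we can make the change of variables $(x,g) \mapsto (\iota(a)x, \iota(b)g)$ and write

\[ \mathbf{E}_{x,g \in B} f_0(x) f_1(xg) f_2(xg^2) f_3(xg^3) = \mathbf{E}_{x,g \in B} f_0(\iota(a)x) f_1(\iota(a)x\iota(b)g) f_2(\iota(a)x\iota(b)g\iota(b)g) f_3(\iota(a)x\iota(b)g\iota(b)g\iota(b)g). \]

By using the identity

\[ x\iota(b) = \iota(\pi(x)^2 b)x \]

for any $x \in B$ and $b \in F$, we can rewrite the above identity as

\[ \mathbf{E}_{x,g \in B} f_0(x) f_1(xg) f_2(xg^2) f_3(xg^3) = \mathbf{E}_{x,g \in B} f_0(\iota(a)x) f_1(\iota(a + \pi(x)^2 b)xg) f_2(\iota(a + \pi(x)^2 b + \pi(xg)^2 b)xg^2) f_3(\iota(a + \pi(x)^2 b + \pi(xg)^2 b + \pi(xg^2)^2 b)xg^3). \]

On averaging in $a, b$, we conclude that

\[ \mathbf{E}_{x,g \in B} f_0(x) f_1(xg) f_2(xg^2) f_3(xg^3) = \mathbf{E}_{x,g \in B} \mathbf{E}_{a,b \in F} f_{0,x}(a) f_{1,xg}(a + \pi(x)^2 b) f_{2,xg^2}(a + \pi(x)^2 b + \pi(xg)^2 b) f_{3,xg^3}(a + \pi(x)^2 b + \pi(xg)^2 b + \pi(xg^2)^2 b) \]

where $f_{i,x} : F \to \mathbf{R}$ are the functions

\[ f_{i,x}(a) := f_i(\iota(a)x). \]

By dilating $b$ by $\pi(x)^2$, we may simplify the above expression slightly as

\[ \mathbf{E}_{x,g \in B} \mathbf{E}_{a,b \in F} f_{0,x}(a) f_{1,xg}(a + b) f_{2,xg^2}(a + (1 + \pi(g)^2)b) f_{3,xg^3}(a + (1 + \pi(g)^2 + \pi(g)^4)b). \]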
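The commutation identity $x\iota(b) = \iota(\pi(x)^2 b)x$ and the resulting shift of the third term of the progression can be verified exhaustively over a small prime field. The sketch below assumes $F = \mathbf{F}_p$ and takes $\pi$ to be the upper-left diagonal entry, which is the normalisation under which the identity holds; the tuple encoding `(a, b, c, d)` of a matrix and the function names are illustrative choices.

```python
def mul(x, y, p):
    """Product of 2 x 2 matrices (a, b, c, d) modulo p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def iota(a, p):
    """Unipotent upper-triangular matrix with upper right entry a."""
    return (1, a % p, 0, 1)

def pi(x):
    """Upper-left diagonal entry of an upper-triangular matrix."""
    return x[0]

def borel(p):
    """All upper-triangular matrices of determinant 1 over F_p."""
    return [(t, a, 0, pow(t, -1, p)) for t in range(1, p) for a in range(p)]

def commutation_identity_holds(p):
    """Check x iota(b) == iota(pi(x)^2 b) x for all x in B and b in F_p."""
    return all(mul(x, iota(b, p), p) == mul(iota(pi(x) ** 2 * b, p), x, p)
               for x in borel(p) for b in range(p))

def third_term_shift_holds(p):
    """Check iota(a) x iota(b) g iota(b) g == iota(a + pi(x)^2 b + pi(xg)^2 b) x g^2."""
    B = borel(p)
    for x in B:
        for g in B:
            xg = mul(x, g, p)
            for a in range(p):
                for b in range(p):
                    left = mul(mul(mul(iota(a, p), x, p), iota(b, p), p), g, p)
                    left = mul(left, mul(iota(b, p), g, p), p)
                    shift = a + pi(x) ** 2 * b + pi(xg) ** 2 * b
                    if left != mul(iota(shift, p), mul(xg, g, p), p):
                        return False
    return True
```

Both checks return `True` for small primes such as 5 and 7, consistent with the shifts $a + \pi(x)^2 b$ and $a + \pi(x)^2 b + \pi(xg)^2 b$ appearing in the rewritten average.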

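As is well known, the inner average has too high of a "complexity" to be directly treated by Fourier analysis. However, following Gowers [10], we may reduce to a form tractable to Fourier analysis after applying the Cauchy-Schwarz inequality. Indeed, from that inequality we can bound the preceding expression in magnitude by

\[ \left( \mathbf{E}_{x,g \in B} \mathbf{E}_{a \in F} \left| \mathbf{E}_{b \in F} f_{1,xg}(a + b) f_{2,xg^2}(a + (1 + \pi(g)^2)b) f_{3,xg^3}(a + (1 + \pi(g)^2 + \pi(g)^4)b) \right|^2 \right)^{1/2}. \]

We may expand this expression as

\[ \Big( \mathbf{E}_{x,g \in B} \mathbf{E}_{a,b,b' \in F} f_{1,xg}(a + b) f_{1,xg}(a + b') f_{2,xg^2}(a + (1 + \pi(g)^2)b) f_{2,xg^2}(a + (1 + \pi(g)^2)b') f_{3,xg^3}(a + (1 + \pi(g)^2 + \pi(g)^4)b) f_{3,xg^3}(a + (1 + \pi(g)^2 + \pi(g)^4)b') \Big)^{1/2}. \]

Writing $b' = b + h$ and shifting $x$ by $g$, this becomes

\[ \Big( \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \mathbf{E}_{a,b \in F} \Delta_h f_{1,x}(a + b) \Delta_{(1 + \pi(g)^2)h} f_{2,xg}(a + (1 + \pi(g)^2)b) \Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}(a + (1 + \pi(g)^2 + \pi(g)^4)b) \Big)^{1/2} \]

where $\Delta_h f(a) := f(a) f(a + h)$.

Shifting $a$ by $b$, then dilating $b$ by $\pi(g)^{-2}$, we may simplify this slightly as

\[ \Big( \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \mathbf{E}_{a,b \in F} \Delta_h f_{1,x}(a) \Delta_{(1 + \pi(g)^2)h} f_{2,xg}(a + b) \Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}(a + (1 + \pi(g)^2)b) \Big)^{1/2} \]

and so our task is now to show that

\[ \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \mathbf{E}_{a,b \in F} \Delta_h f_{1,x}(a) \Delta_{(1 + \pi(g)^2)h} f_{2,xg}(a + b) \Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}(a + (1 + \pi(g)^2)b) = o(1). \tag{7.1} \]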
The next step is Fourier expansion. Consider the trilinear form

\[ \mathbf{E}_{a,b \in F} H_1(a) H_2(a + b) H_3(a + (1 + \pi(g)^2)b) \]

for some functions $H_1, H_2, H_3 : F \to \mathbf{C}$. Using some arbitrary non-degenerate bilinear form $\cdot : F \times F \to \mathbf{R}/\mathbf{Z}$, we can form the Fourier series

\[ H_i(a) = \sum_{\xi \in F} \hat H_i(\xi) e(\xi \cdot a) \]

for $i = 1, 2, 3$, where $e(x) := e^{2\pi i x}$ and

\[ \hat H_i(\xi) := \mathbf{E}_{a \in F} H_i(a) e(-\xi \cdot a). \]

Inserting these Fourier series and simplifying, we arrive at the identity

\[ \mathbf{E}_{a,b \in F} H_1(a) H_2(a + b) H_3(a + (1 + \pi(g)^2)b) = \sum_{\xi \in F} \hat H_1(\xi) \hat H_2(-(1 + \pi(g)^{-2})\xi) \hat H_3(\pi(g)^{-2}\xi). \]
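The Fourier identity for the trilinear form can be tested numerically. The sketch below assumes $F = \mathbf{F}_p$ is a prime field with the bilinear form $\xi \cdot a := \xi a / p \bmod 1$, and writes `t` for $\pi(g)$; these choices, the random test functions, and the function names are assumptions made for the illustration.

```python
import cmath
import random

def e(x):
    """The character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def fourier(H, p):
    """Fourier coefficients hat H(xi) = E_a H(a) e(-xi a / p)."""
    return [sum(H[a] * e(-xi * a / p) for a in range(p)) / p for xi in range(p)]

def trilinear_direct(H1, H2, H3, t, p):
    """E_{a,b} H1(a) H2(a + b) H3(a + (1 + t^2) b) computed from the definition."""
    c = (1 + t * t) % p
    return sum(H1[a] * H2[(a + b) % p] * H3[(a + c * b) % p]
               for a in range(p) for b in range(p)) / p ** 2

def trilinear_fourier(H1, H2, H3, t, p):
    """sum_xi hat H1(xi) hat H2(-(1 + t^-2) xi) hat H3(t^-2 xi)."""
    t_inv_sq = pow(t, -2, p)
    h1, h2, h3 = fourier(H1, p), fourier(H2, p), fourier(H3, p)
    return sum(h1[xi] * h2[(-(1 + t_inv_sq) * xi) % p] * h3[(t_inv_sq * xi) % p]
               for xi in range(p))

def random_function(p, rng):
    """A random complex-valued function on F_p."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(p)]
```

For random inputs the two evaluations agree up to floating-point error for every non-zero `t`, reflecting the frequency constraints $\xi_1 + \xi_2 + \xi_3 = 0$ and $\xi_2 + (1 + t^2)\xi_3 = 0$ that arise from averaging in $a$ and $b$.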

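We may thus write the left-hand side of (7.1) as

\[ \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \sum_{\xi \in F} \widehat{\Delta_h f_{1,x}}(\xi) \widehat{\Delta_{(1 + \pi(g)^2)h} f_{2,xg}}(-(1 + \pi(g)^{-2})\xi) \widehat{\Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}}(\pi(g)^{-2}\xi). \]

Splitting off the $\xi = 0$ and $\xi \neq 0$ terms, we see that to prove (7.1), it will suffice to establish the bounds

\[ \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \widehat{\Delta_h f_{1,x}}(0) \widehat{\Delta_{(1 + \pi(g)^2)h} f_{2,xg}}(0) \widehat{\Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}}(0) = o(1) \tag{7.2} \]

and

\[ \mathbf{E}_{x,g \in B} \mathbf{E}_{h \in F} \sum_{\xi \in F^\times} |\widehat{\Delta_h f_{1,x}}(\xi)| |\widehat{\Delta_{(1 + \pi(g)^2)h} f_{2,xg}}(-(1 + \pi(g)^{-2})\xi)| |\widehat{\Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}}(\pi(g)^{-2}\xi)| = o(1). \tag{7.3} \]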
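7.1. The contribution of the zero frequency. We now prove (7.2). We have

\[ \widehat{\Delta_h f_{1,x}}(0) = \mathbf{E}_{a \in F} f_{1,x}(a) f_{1,x}(a + h) \]

and thus by Fourier expansion

\[ \widehat{\Delta_h f_{1,x}}(0) = \sum_{\xi_1 \in F} |\hat f_{1,x}(\xi_1)|^2 e(\xi_1 \cdot h). \]

Similarly we have

\[ \widehat{\Delta_{(1 + \pi(g)^2)h} f_{2,xg}}(0) = \sum_{\xi_2 \in F} |\hat f_{2,xg}(\xi_2)|^2 e((1 + \pi(g)^2)\xi_2 \cdot h) \]

and

\[ \widehat{\Delta_{(1 + \pi(g)^2 + \pi(g)^4)h} f_{3,xg^2}}(0) = \sum_{\xi_3 \in F} |\hat f_{3,xg^2}(\xi_3)|^2 e((1 + \pi(g)^2 + \pi(g)^4)\xi_3 \cdot h). \]

Inserting these identities and performing the $h$ averaging, we conclude that the left-hand side of (7.2) can be rewritten as

\[ \mathbf{E}_{x,g \in B} \sum_{\xi_1,\xi_2,\xi_3 \in F : \xi_1 + (1 + \pi(g)^2)\xi_2 + (1 + \pi(g)^2 + \pi(g)^4)\xi_3 = 0} |\hat f_{1,x}(\xi_1)|^2 |\hat f_{2,xg}(\xi_2)|^2 |\hat f_{3,xg^2}(\xi_3)|^2. \]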
Recall that $f_{i_0}$ was assumed to have mean zero on cosets of $U$, which implies that we may restrict $\xi_{i_0}$ to be non-zero. We note that the quantity $|\hat f_{i,x}(\xi)|^2$ is unchanged if one multiplies $x$ on the left (or right) by an element of $U$, and so we may write

\[ |\hat f_{i,x}(\xi)|^2 = \mu_{i,\pi(x)}(\xi) \]

for some non-negative quantity $\mu_{i,t}(\xi)$, defined for $i = 1, 2, 3$, $t \in F^\times$, and $\xi \in F$. We can then simplify the previous expression as

\[ \mathbf{E}_{s,t \in F^\times} \sum_{\xi_1,\xi_2,\xi_3 \in F : \xi_1 + (1 + t^2)\xi_2 + (1 + t^2 + t^4)\xi_3 = 0; \ \xi_{i_0} \neq 0} \mu_{1,s}(\xi_1) \mu_{2,st}(\xi_2) \mu_{3,st^2}(\xi_3). \tag{7.4} \]

To show that this expression is $o(1)$, it will suffice to establish the combinatorial bound

\[ \mathbf{E}_{s,t \in F^\times} 1_{\eta_1(s) + (1 + t^2)\eta_2(st) + (1 + t^2 + t^4)\eta_3(st^2) = 0} = o(1) \tag{7.5} \]

for any choice of functions $\eta_i : F^\times \to F$ for $i = 1, 2, 3$, with $\eta_{i_0}$ non-zero. Indeed, by the Plancherel identity we have

\[ \sum_{\xi \in F} \mu_{i,s}(\xi) \leq 1 \]

for all $i = 1, 2, 3$ and $s \in F^\times$, with $\mu_{i_0,s}(0) = 0$, so we may find random functions $\eta_i : F^\times \to F$ with $\eta_{i_0}$ nowhere vanishing, and with the property that

\[ \mathbf{P}(\eta_i(s) = \xi) \geq \mu_{i,s}(\xi) \]

for all $i = 1, 2, 3$ and $s \in F^\times$. Applying (7.5) with these functions and taking expectations, we conclude that the quantity (7.4) is $o(1)$ as desired.

It remains to establish (7.5), which is a bound of "sum-product" type, in that it is asserting a certain combinatorial incompatibility between the multiplicative and additive structures on $F$. Assume for contradiction that we can find arbitrarily large finite fields $F$ and functions $\eta_1, \eta_2, \eta_3 : F^\times \to F$ with $\eta_{i_0}$ nowhere vanishing, for which

\[ \mathbf{E}_{s,t \in F^\times} 1_{\eta_1(s) + (1 + t^2)\eta_2(st) + (1 + t^2 + t^4)\eta_3(st^2) = 0} \gg 1. \]

Fix $F, \eta_1, \eta_2, \eta_3$. Let $A \subset (F^\times)^2$ be the set of all pairs $(s,t)$ for which

\[ \eta_1(s) + (1 + t^2)\eta_2(st) + (1 + t^2 + t^4)\eta_3(st^2) = 0, \]

thus $|A| \gg |F|^2$. Applying the multidimensional Szemerédi theorem (Theorem B.1) to the multiplicative group $F^\times$, we conclude that there are $\gg |F|^3$ triples $(s,t,r)$ with the property that $(sr^i, tr^j) \in A$ for all $-100 \leq i,j \leq 100$ (say), thus

\[ \eta_1(sr^i) + (1 + r^{2j}t^2)\eta_2(str^{i+j}) + (1 + r^{2j}t^2 + r^{4j}t^4)\eta_3(st^2r^{i+2j}) = 0 \tag{7.6} \]

for all $-100 \leq i,j \leq 100$. We will eliminate the $\eta_i$ terms from (7.6) (taking advantage of the non-vanishing nature of $\eta_{i_0}$) to obtain a non-trivial algebraic constraint on $s,t,r$, which will contradict the assertion that $\gg |F|^3$ triples $(s,t,r)$ exist with this property if $|F|$ is large enough.

We turn to the details. Fix $s,t,r$ obeying (7.6). If we abbreviate $\eta_k(st^{k-1}r^i)$ as $c_k(i)$, and also write $\alpha_j := 1 + r^{2j}t^2$ and $\beta_j := 1 + r^{2j}t^2 + r^{4j}t^4$, we have

\[ c_1(i) + \alpha_j c_2(i+j) + \beta_j c_3(i+2j) = 0 \]

for all $-100 \leq i,j \leq 100$. In particular, applying this identity for $j$ and $j+1$ and subtracting, we have

\[ \alpha_{j+1} c_2(i+j+1) - \alpha_j c_2(i+j) = \beta_j c_3(i+2j) - \beta_{j+1} c_3(i+2j+2) \]

for all $-90 \leq i,j \leq 90$ (say). Replacing $(i,j)$ by $(i-2,j+2)$, $(i+2,j-1)$, and $(i,j+1)$, we obtain the system of four equations

\[ \alpha_{j+1} c_2(i+j+1) - \alpha_j c_2(i+j) = \beta_j c_3(i+2j) - \beta_{j+1} c_3(i+2j+2) \tag{7.7} \]

\[ \alpha_{j+3} c_2(i+j+1) - \alpha_{j+2} c_2(i+j) = \beta_{j+2} c_3(i+2j+2) - \beta_{j+3} c_3(i+2j+4) \tag{7.8} \]

\[ \alpha_j c_2(i+j+2) - \alpha_{j-1} c_2(i+j+1) = \beta_{j-1} c_3(i+2j) - \beta_j c_3(i+2j+2) \tag{7.9} \]

\[ \alpha_{j+2} c_2(i+j+2) - \alpha_{j+1} c_2(i+j+1) = \beta_{j+1} c_3(i+2j+2) - \beta_{j+2} c_3(i+2j+4) \tag{7.10} \]

for all $-80 \leq i,j \leq 80$ (say).

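We now eliminate the various $c_2$ factors in this system to obtain a linear recurrence in the $c_3$. Multiplying (7.7) by $\alpha_{j+2}$ and (7.8) by $\alpha_j$ and subtracting to eliminate the $c_2(i+j)$ term, we conclude that

\[ (\alpha_{j+1}\alpha_{j+2} - \alpha_{j+3}\alpha_j) c_2(i+j+1) = \beta_j\alpha_{j+2} c_3(i+2j) - (\beta_{j+1}\alpha_{j+2} + \beta_{j+2}\alpha_j) c_3(i+2j+2) + \beta_{j+3}\alpha_j c_3(i+2j+4). \tag{7.11} \]

Similarly, if we multiply (7.9) by $\alpha_{j+2}$ and (7.10) by $\alpha_j$ and subtract to eliminate the $c_2(i+j+2)$ term, we have

\[ (\alpha_j\alpha_{j+1} - \alpha_{j-1}\alpha_{j+2}) c_2(i+j+1) = \beta_{j-1}\alpha_{j+2} c_3(i+2j) - (\beta_j\alpha_{j+2} + \beta_{j+1}\alpha_j) c_3(i+2j+2) + \beta_{j+2}\alpha_j c_3(i+2j+4). \]

A brief calculation reveals that

\[ \alpha_{j+1}\alpha_{j+2} - \alpha_{j+3}\alpha_j = r^2(\alpha_j\alpha_{j+1} - \alpha_{j+2}\alpha_{j-1}) \]

and so we may also eliminate $c_2(i+j+1)$ and conclude that

\[ \beta'_j\alpha_{j+2} c_3(i+2j) - (\beta'_{j+1}\alpha_{j+2} + \beta'_{j+2}\alpha_j) c_3(i+2j+2) + \beta'_{j+3}\alpha_j c_3(i+2j+4) = 0 \tag{7.12} \]

for all $-80 \leq i,j \leq 80$, where

\[ \beta'_j := \beta_j - r^2\beta_{j-1} = (1 - r^{-2})(r^{4j}t^4 - r^2). \]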
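The "brief calculation" and the closed form for $\beta'_j$ are polynomial identities in $r$ and $t$, so they can be spot-checked with exact rational arithmetic. The sketch below uses the definitions $\alpha_j = 1 + r^{2j}t^2$ and $\beta_j = 1 + r^{2j}t^2 + r^{4j}t^4$ from the text; the sample values of $r$, $t$ and $j$ are arbitrary choices for illustration.

```python
from fractions import Fraction

def alpha(j, r, t):
    """alpha_j = 1 + r^(2j) t^2."""
    return 1 + r ** (2 * j) * t ** 2

def beta(j, r, t):
    """beta_j = 1 + r^(2j) t^2 + r^(4j) t^4."""
    return 1 + r ** (2 * j) * t ** 2 + r ** (4 * j) * t ** 4

def beta_prime(j, r, t):
    """beta'_j = beta_j - r^2 beta_{j-1}."""
    return beta(j, r, t) - r ** 2 * beta(j - 1, r, t)

def alpha_identity_holds(j, r, t):
    """Check alpha_{j+1} alpha_{j+2} - alpha_{j+3} alpha_j = r^2 (alpha_j alpha_{j+1} - alpha_{j+2} alpha_{j-1})."""
    left = alpha(j + 1, r, t) * alpha(j + 2, r, t) - alpha(j + 3, r, t) * alpha(j, r, t)
    right = r ** 2 * (alpha(j, r, t) * alpha(j + 1, r, t)
                      - alpha(j + 2, r, t) * alpha(j - 1, r, t))
    return left == right

def beta_prime_closed_form_holds(j, r, t):
    """Check beta'_j = (1 - r^-2)(r^(4j) t^4 - r^2)."""
    return beta_prime(j, r, t) == (1 - r ** -2) * (r ** (4 * j) * t ** 4 - r ** 2)

# Arbitrary non-zero rational sample points (illustrative only).
SAMPLES = [(Fraction(2), Fraction(2)), (Fraction(3, 2), Fraction(-5, 7)), (Fraction(-4), Fraction(1, 3))]
```

Both checks hold at every sample point and every index `j` in a small range, as expected for identities between rational functions of $r$ and $t$.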

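We continue the elimination process. Applying (7.12) with $(i,j)$ replaced by $(i+2,j-1)$, we conclude that

\[ \beta'_{j-1}\alpha_{j+1} c_3(i+2j) - (\beta'_j\alpha_{j+1} + \beta'_{j+1}\alpha_{j-1}) c_3(i+2j+2) + \beta'_{j+2}\alpha_{j-1} c_3(i+2j+4) = 0 \]

for all $-70 \leq i,j \leq 70$ (say). Multiplying this equation by $\beta'_{j+3}\alpha_j$ and (7.12) by $\beta'_{j+2}\alpha_{j-1}$ and subtracting, we conclude that

\[ (\beta'_{j-1}\beta'_{j+3}\alpha_j\alpha_{j+1} - \beta'_j\beta'_{j+2}\alpha_{j-1}\alpha_{j+2}) c_3(i+2j) = (\beta'_j\beta'_{j+3}\alpha_j\alpha_{j+1} + \beta'_{j+1}\beta'_{j+3}\alpha_{j-1}\alpha_j - \beta'_{j+1}\beta'_{j+2}\alpha_{j-1}\alpha_{j+2} - (\beta'_{j+2})^2\alpha_{j-1}\alpha_j) c_3(i+2j+2) \]

for all $-70 \leq i,j \leq 70$.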

We apply this with $(i,j)$ replaced by $(i-2,1)$ and $(i-4,2)$ to conclude that

\[ (\beta'_0\beta'_4\alpha_1\alpha_2 - \beta'_1\beta'_3\alpha_0\alpha_3) c_3(i) = (\beta'_1\beta'_4\alpha_1\alpha_2 + \beta'_2\beta'_4\alpha_0\alpha_1 - \beta'_2\beta'_3\alpha_0\alpha_3 - (\beta'_3)^2\alpha_0\alpha_1) c_3(i+2) \]

and

\[ (\beta'_1\beta'_5\alpha_2\alpha_3 - \beta'_2\beta'_4\alpha_1\alpha_4) c_3(i) = (\beta'_2\beta'_5\alpha_2\alpha_3 + \beta'_3\beta'_5\alpha_1\alpha_2 - \beta'_3\beta'_4\alpha_1\alpha_4 - (\beta'_4)^2\alpha_1\alpha_2) c_3(i+2) \]

for all $-60 \leq i \leq 60$ (say). Eliminating $c_3(i+2)$, we conclude that either $c_3(i)$ vanishes for all $-60 \leq i \leq 60$, or else we have the constraint

\[ (\beta'_0\beta'_4\alpha_1\alpha_2 - \beta'_1\beta'_3\alpha_0\alpha_3)(\beta'_2\beta'_5\alpha_2\alpha_3 + \beta'_3\beta'_5\alpha_1\alpha_2 - \beta'_3\beta'_4\alpha_1\alpha_4 - (\beta'_4)^2\alpha_1\alpha_2) = (\beta'_1\beta'_5\alpha_2\alpha_3 - \beta'_2\beta'_4\alpha_1\alpha_4)(\beta'_1\beta'_4\alpha_1\alpha_2 + \beta'_2\beta'_4\alpha_0\alpha_1 - \beta'_2\beta'_3\alpha_0\alpha_3 - (\beta'_3)^2\alpha_0\alpha_1). \]

After eliminating some factors of $(1 - r^{-2})$, this is a polynomial constraint between $r$ and $t$ of bounded degree. One can easily verify that the constraint is not a tautology (for instance, setting $r = 2$ and $t = 2$, the left-hand side and the right-hand side take different values). Thus, by the Schwarz-Zippel lemma, there are only $O(|F|)$ possible pairs $(r,t)$, and thus $O(|F|^2)$ triples $(r,s,t)$, that obey this constraint. Outside of those exceptional triples, we thus have $c_3(i)$ vanishing for all $-60 \leq i \leq 60$. Applying (7.11), we conclude that $c_2(0)$ vanishes as well, unless $\alpha_1\alpha_2 - \alpha_3\alpha_0$ vanishes. The latter possibility is also a bounded degree non-tautological constraint on $r,t$ and so also only occurs for $O(|F|^2)$ triples $(r,s,t)$. Thus we see that $c_3(0)$ and $c_2(0)$ both vanish outside of these exceptional triples. But this contradicts the assumption that $\eta_{i_0}$ never vanishes (recall that $i_0$ is either $2$ or $3$). We have thus demonstrated that there are at most $O(|F|^2)$ triples $(r,s,t)$ for which (7.6) holds for all $-100 \leq i,j \leq 100$. But we also know that there are $\gg |F|^3$ such triples, leading to a contradiction for $|F|$ sufficiently large, as required.
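The non-tautology of the final constraint can be checked by exact evaluation at $r = t = 2$. The sketch below is one reading of the elimination, under the assumption that the two-term recurrence has the form $P(j)\,c_3(i+2j) = Q(j)\,c_3(i+2j+2)$ with $\alpha_j = 1 + r^{2j}t^2$ and $\beta'_j = (1 - r^{-2})(r^{4j}t^4 - r^2)$; the names `P` and `Q` are introduced here for illustration only, and the constraint is then $P(1)Q(2) = P(2)Q(1)$.

```python
from fractions import Fraction

def alpha(j, r, t):
    """alpha_j = 1 + r^(2j) t^2."""
    return 1 + r ** (2 * j) * t ** 2

def beta_prime(j, r, t):
    """beta'_j = (1 - r^-2)(r^(4j) t^4 - r^2)."""
    return (1 - r ** -2) * (r ** (4 * j) * t ** 4 - r ** 2)

def P(j, r, t):
    """Coefficient of c_3(i + 2j) in the two-term recurrence."""
    a = lambda n: alpha(n, r, t)
    b = lambda n: beta_prime(n, r, t)
    return b(j - 1) * b(j + 3) * a(j) * a(j + 1) - b(j) * b(j + 2) * a(j - 1) * a(j + 2)

def Q(j, r, t):
    """Coefficient of c_3(i + 2j + 2) in the two-term recurrence."""
    a = lambda n: alpha(n, r, t)
    b = lambda n: beta_prime(n, r, t)
    return (b(j) * b(j + 3) * a(j) * a(j + 1)
            + b(j + 1) * b(j + 3) * a(j - 1) * a(j)
            - b(j + 1) * b(j + 2) * a(j - 1) * a(j + 2)
            - b(j + 2) ** 2 * a(j - 1) * a(j))

def constraint_sides(r, t):
    """Both sides of P(1) Q(2) = P(2) Q(1)."""
    return P(1, r, t) * Q(2, r, t), P(2, r, t) * Q(1, r, t)

lhs, rhs = constraint_sides(Fraction(2), Fraction(2))
```

Since `lhs` and `rhs` differ at this single point, the constraint is a non-trivial polynomial condition on $(r,t)$, which is all that the Schwarz-Zippel step requires.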

7.2. The contribution of the non-zero frequencies. Finally, we prove (7.3). This will be done by a variant of the Cauchy-Schwarz arguments used to establish Theorem 1.4. Observe that if one multiplies $x \in B$ on the left by some element $\iota(k)$ of $U$, then $f_{i,x}$ and $\Delta_h f_{i,x}$ become translated by $k$, and the quantity $|\widehat{\Delta_h f_{i,x}}(\xi)|$ is unchanged. Thus, for any $i = 1, 2, 3$, $x \in B$, $h \in F$, and $\xi \in F^\times$, we may write

\[ |\widehat{\Delta_h f_{i,x}}(\xi)| = H_{i,h,\pi(x)}(\xi) \tag{7.13} \]

for some function $H_{i,h,\pi(x)} : F^\times \to \mathbf{R}^+$ depending on $h$ and $\pi(x)$. We may thus rewrite (7.3) as
Ese_FxEftgF ^ -ffi,;j.,s(0EtGFx-ff2,(i-i-t4)/i,st(-(l+^^'^)0-f^3,(i+t'i+t8)/i,st2(i~''^) = o 
ee-Fx 

From Plancherel we have

\[ \sum_{\xi \in F^\times} H_{1,h,s}(\xi)^2 \leq 1 \]

for all $s \in F^\times$ and $h \in F$, so by Cauchy-Schwarz it suffices to show that

Egg^xE/igF E |EtgFx-f^2,(l+t«)h,st(-(l + *^'*)'^)^^3,(l+t*+t**)'i,st^(^~''^)P = 0(1)! 

eeFx 

which we expand as 

Es,t,„eFxE;jgF E ^^2,(l+t*)/i,s«(^(l + ^"'')C)^-f^3,(l+t''-(-t8)/i,st2(i"'*0^ 

By another Cauchy-Schwarz and symmetry, it thus suffices to show that 

Es,t,ueFxE/,gF E ■f^2,(i+fi)/i,st(~(l + ^"'')0-f^3,(n-«''-i-««)/i.,s«2(""**0 = o(l)- 

There are at most $4$ values of $t$ for which $t^4 = -1$, and each of these values of $t$ contributes $O(|F|^{-1})$ to the above sum (using Plancherel's theorem $\sum_{\xi} H_{i,h,s}(\xi)^2 \leq 1$ and the trivial bound $H_{i,h,s}(\xi) \leq 1$), and may be discarded. Dilating $h, s, \xi$ by $(1 + t^4)^{-1}$, $t^{-1}$, $-(1 + t^{-4})^{-1}$ respectively, we rewrite the remaining component of the above estimate as

Es,t,«e_FxE,,gF ^ lt''#-l-^^2,/i,s(C)-f^3,(l+u'i+nS)(l+t'i)-ih,st-iu2(-(l+^"'*)~^"~''0 = 

Making the change of variables $(s,u,v) := (s,u,st^{-1}u^2)$, so that $t = su^2v^{-1}$, this becomes

From Plancherel's theorem and the trivial bound i?2,;i.s(0 < 1 we have 

?eFx 

for each h ^ F and s £ F'^- . It will thus suffice to establish the bound 

E„eFx lsi«8„-4-^_iiJ3 (;^^„4+„s)(i+s4„8^-4)-i,,^„(-(l + s^^u^^v^y^u^'^^) ^ o(l) 

for all $\xi \in F^\times$, and all but at most $o(|F|^3)$ choices of $(s,v,h) \in F^\times \times F^\times \times F$.

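Fix $s,v,h$. Our task is to show that for all but $o(|F|^3)$ choices of $(s,v,h)$, one has

\[ \mathbf{E}_{u \in F^\times} 1_A(u) H_{3,\phi(u),v}(\eta(u))^4 = o(1) \tag{7.14} \]

where $A := \{u \in F^\times : s^4u^8v^{-4} \neq -1\}$,

\[ \phi(u) := (1 + u^4 + u^8)(1 + s^4u^8v^{-4})^{-1}h \]

and

\[ \eta(u) := -(1 + s^{-4}u^{-8}v^4)^{-1}u^{-4}\xi. \]

We may assume that $h$ is non-zero, as this only excludes $O(|F|^2) = o(|F|^3)$ values of $(s,v,h)$.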

If we write $f := f_{3,g}$ for some $g \in \pi^{-1}(v)$ and expand out the definition (7.13) of $H_{3,h,s}$, we may rewrite (7.14) as

\[ \mathbf{E}_{u \in F} 1_A(u) |\widehat{\Delta_{\phi(u)} f}(\eta(u))|^4 = o(1). \tag{7.15} \]

The next step is to apply the Cauchy-Schwarz inequality again, in the spirit of the work of Gowers [10]. First, to show (7.15), it will suffice to show (using the trivial bound $|\widehat{\Delta_h f}(\eta)| \leq 1$) that

\[ \mathbf{E}_{u \in F} 1_A(u) |\widehat{\Delta_{\phi(u)} f}(\eta(u))| = o(1) \]

or equivalently that

\[ \mathbf{E}_{u \in F} b(u) \widehat{\Delta_{\phi(u)} f}(\eta(u)) = o(1) \]

for any function $b : F \to \mathbf{R}$ supported on $A$ with $|b(u)| \leq 1$ for all $u$. We can expand the left-hand side as

\[ \mathbf{E}_{x,u \in F} b(u) f(x) f(x + \phi(u)) e(-\eta(u) \cdot x), \]

and rearrange this as

\[ \mathbf{E}_{x,y \in F} f(x) f(y) K(x,y) \]

where

\[ K(x,y) := \sum_{u \in F : \phi(u) = y - x} b(u) e(-\eta(u) \cdot x). \]

Applying the Cauchy-Schwarz inequality twice, and using the boundedness of $f$, we have

\[ |\mathbf{E}_{x,y \in F} f(x) f(y) K(x,y)|^4 \leq \mathbf{E}_{x,y,x',y' \in F} K(x,y) \overline{K(x,y')} \overline{K(x',y)} K(x',y') \]

so it will suffice to show that

\[ \mathbf{E}_{x,y,x',y' \in F} K(x,y) \overline{K(x,y')} \overline{K(x',y)} K(x',y') = o(1). \]

The left-hand side may be expanded as

\[ |F|^{-4} \sum_{u_1,u_2,u_3,u_4 \in F} b(u_1) b(u_2) b(u_3) b(u_4) \sum_{x,y,x',y' \in F : \phi(u_1) = y - x, \ \phi(u_2) = y' - x, \ \phi(u_3) = y - x', \ \phi(u_4) = y' - x'} e(-(\eta(u_1) - \eta(u_2) - \eta(u_3) + \eta(u_4)) \cdot x) \, e((\eta(u_3) - \eta(u_4)) \cdot (x' - x)). \]

The quantity $x' - x$ in the summand is equal to $\phi(u_1) - \phi(u_3)$, and so this phase is constant over the inner summation. By Fourier analysis, we see that the inner summation is thus $O(|F|)$ when $\eta(u_1) + \eta(u_4) = \eta(u_2) + \eta(u_3)$ and $\phi(u_1) + \phi(u_4) = \phi(u_2) + \phi(u_3)$, and zero otherwise. It thus suffices to show that

\[ |\{(u_1,u_2,u_3,u_4) \in A^4 : \eta(u_1) + \eta(u_4) = \eta(u_2) + \eta(u_3); \ \phi(u_1) + \phi(u_4) = \phi(u_2) + \phi(u_3)\}| = o(|F|^3). \]

Canceling out the non-zero $h$ and $\xi$ factors, and replacing each of the $u_i$ by their fourth powers (at the cost of paying $O(1)$ in the cardinality bound), this becomes

\[ |\{(u_1,u_2,u_3,u_4) \in F^4 : \Phi(u_1) + \Phi(u_4) = \Phi(u_2) + \Phi(u_3)\}| = o(|F|^3) \]
where $\Phi : F \to F^2$ is the rational function

\[ \Phi(u) := \left( (1 + u + u^2)(1 + ku^2)^{-1}, \ (1 + k^{-1}u^{-2})^{-1}u^{-1} \right) \]

and $k := s^4v^{-4}$. We can simplify $(1 + k^{-1}u^{-2})^{-1}u^{-1}$ as $ku(1 + ku^2)^{-1}$ and $(1 + u + u^2)(1 + ku^2)^{-1}$ as $k^{-1} + (1 - k^{-1} + u)(1 + ku^2)^{-1}$, so after excluding the $O(|F|^2) = o(|F|^3)$ triplets $(s,v,h)$ for which $k = 1$, we may replace $\Phi$ by

\[ \tilde\Phi(u) := \left( (1 + ku^2)^{-1}, \ u(1 + ku^2)^{-1} \right). \]

This function takes values in the conic section

\[ C := \{(x,y) \in F^2 : x^2 + ky^2 = x\} \]

with each point in $C$ arising from at most two values of $u$, and so it suffices to show that

\[ |\{(p_1,p_2,p_3,p_4) \in C^4 : p_1 + p_4 = p_2 + p_3\}| = o(|F|^3). \]

But from Bezout's theorem we see that each point in $F^2$ can be expressed in at most two ways as the sum of two elements in $C$, and so the left-hand side is $O(|F|^2)$, and the claim follows.
Remark 7.3. The above argument in fact allows us to replace $o(1)$ by $O(|F|^{-c})$ for some absolute constant $c > 0$, for the contribution of the non-zero frequencies $\xi$. Unfortunately, due to the reliance on the multidimensional Szemerédi theorem, we are unable to obtain a similarly strong bound for the contribution of the zero frequencies.

Appendix A. Some algebraic geometry

Throughout this appendix, $k$ is an algebraically closed field, and $F$ is a finite subfield of $k$. The purpose of this appendix is to review some basic algebraic geometry regarding varieties and regular maps over $k$.

We begin with the definition of a variety. For the purposes of this paper, we may restrict attention to affine varieties for simplicity, but most of the results here can be extended to other types of varieties (projective, quasiprojective, etc.).

Definition A.1 (Varieties). An (affine) variety defined over $k$ is a subset $V \subset k^n$ of the form

\[ V = \{x \in k^n : P_1(x) = \cdots = P_m(x) = 0\} \]

where $n, m$ are natural numbers, and $P_1, \ldots, P_m : k^n \to k$ are polynomials. We say that the variety has complexity at most $M$ if $n, m$ are at most $M$, and all the degrees of $P_1, \ldots, P_m$ are at most $M$. If furthermore the polynomials $P_1, \ldots, P_m$ have coefficients defined over $F$, we say that $V$ is defined over $F$ (with complexity at most $M$). A variety is (geometrically) irreducible if it cannot be expressed as the union of two strictly smaller subvarieties.

The Zariski closure of a subset $E$ of $k^n$ is defined to be the intersection of all the varieties in $k^n$ that contain $E$.

The dimension of a non-empty variety $V \subset k^n$ is the largest natural number $d$ for which one has a chain

\[ \emptyset \subsetneq V_0 \subsetneq \cdots \subsetneq V_d \subset V \]

of irreducible varieties $V_0, \ldots, V_d$. We adopt the convention that the empty set has dimension $-\infty$.

We have the following basic upper bound for the number of $F$-points on a variety:

Proposition A.2 (Schwarz-Zippel bound). Let $V \subset k^m$ be an affine variety defined over $k$ of complexity at most $M$ and dimension $d$. Then

\[ |V \cap F^m| \ll_{m,M} |F|^d. \]

Proof. See for instance [16, Lemma 1]. One can make the implied constant depend linearly on the degree of $V$, but we will not need this refinement here. □

In the case that $V$ is irreducible and defined over $F$, we have the following well-known refinement of Proposition A.2:

Proposition A.3 (Lang-Weil bound). Let $V \subset k^m$ be a geometrically irreducible affine variety defined over $F$ of complexity at most $M$ and dimension $d$. Then

\[ |V \cap F^m| = (1 + O_{m,M}(|F|^{-1/2})) |F|^d. \]

In particular, if $|F|$ is sufficiently large depending on $m, M$, one has

\[ |F|^d \ll |V \cap F^m| \ll |F|^d. \]

Proof. See [16, Theorem 1]. Again, more precise versions of the error term are available, but we will not need them here. □
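
As a small numerical illustration of Propositions A.2 and A.3 (not part of the argument), the following sketch counts points on three plane curves over prime fields by brute force. The curves, the primes and the helper name `count_points` are illustrative choices, not taken from the paper. The hyperbola $xy = 1$ and the cubic $y^2 = x^3 + x + 1$ are geometrically irreducible of dimension 1, so their counts should be $p + O(p^{1/2})$; the reducible curve $xy = 0$ still obeys the upper bound of Proposition A.2 but has about $2p$ points.

```python
def count_points(poly, p):
    """Count solutions (x, y) in F_p^2 of poly(x, y) = 0 mod p by brute force."""
    return sum(1 for x in range(p) for y in range(p) if poly(x, y) % p == 0)

def hyperbola(x, y):
    # xy = 1: geometrically irreducible curve, dimension 1
    return x * y - 1

def axes(x, y):
    # xy = 0: union of two lines, reducible, dimension 1
    return x * y

def cubic(x, y):
    # y^2 = x^3 + x + 1: geometrically irreducible cubic, dimension 1
    return y * y - (x ** 3 + x + 1)

for p in (101, 211, 307):  # illustrative primes (the cubic is singular only mod 2 and 31)
    n_hyp = count_points(hyperbola, p)
    n_axes = count_points(axes, p)
    n_cub = count_points(cubic, p)
    # The last column is the normalised Lang-Weil error |N - p| / sqrt(p) for the cubic.
    print(p, n_hyp, n_axes, n_cub, round(abs(n_cub - p) / p ** 0.5, 3))
```

The hyperbola has exactly $p - 1$ points and the pair of axes exactly $2p - 1$, while for the cubic the normalised error stays bounded (by 2, from the classical Hasse bound for elliptic curves), in line with the $O(|F|^{-1/2})$ relative error in Proposition A.3.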

Now we recall the notions of regular and dominant maps between varieties. Our 
definition will be somewhat complicated due to the need to assign quantitative 
complexities to such maps. 

Definition A.4 (Regular map). Let $V \subset k^n$ and $W \subset k^m$ be affine varieties, and let $M \geq 1$. A map $f : V \to W$ is said to be regular with complexity at most $M$ if $V, W$ are individually of complexity at most $M$, and if one can cover $V$ by some varieties $V_1,\ldots,V_r$ of complexity at most $M$ for some $r \leq M$ such that for each $1 \leq j \leq r$, the map $f|_{V_j}$ has the form $(P_{j,1}/Q_{j,1}, \ldots, P_{j,m}/Q_{j,m})$, where the $P_{j,i}, Q_{j,i}$ are homogeneous polynomial maps from $k^{n+1}$ to $k$ with $\deg(P_{j,i}) = \deg(Q_{j,i}) \leq M$, and the $Q_{j,i}$ are non-vanishing on $V_j$.

A regular map $\phi : V \to W$ is dominant if $V$ is irreducible and $\phi(V)$ is Zariski-dense in $W$.

The following proposition asserts (in a certain technical quantitative sense) that 
regular maps are always "essentially dominant" after a reduction in the range, and 
that the fibres of such maps usually have the expected dimension. 

Proposition A.5 (Quantitative dominance). Let $V \subset k^n$, $W \subset k^m$ be algebraic varieties defined over $k$ of complexity at most $M$, with $V$ irreducible, and let $\phi : V \to W$ be a regular map of complexity at most $M$. Then there exists a subset $V'$ of $V$ and an irreducible subvariety $W'$ of $W$ of complexity $O_M(1)$, with the following properties:

(i) (Zariski density) $V \setminus V'$ can be covered by the union of $O_M(1)$ varieties of complexity $O_M(1)$ and dimension strictly less than $\dim(V)$.

(ii) (Controlled image) $W'$ is equal to the Zariski closure of $\phi(V)$; in particular, $\phi : V \to W'$ is a dominant map.

(iii) (Controlled fibres) For each $w \in W'$, the set $\{v \in V' : \phi(v) = w\}$ can be covered by the union of $O_M(1)$ varieties of complexity $O_M(1)$ and dimension at most $\dim(V) - \dim(W')$.

Proof. This follows from [7, Lemma 3.7]. □
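
To see why one must pass to a subset $V'$ in Proposition A.5, a standard toy example is the map $\phi(x,y) = (x, xy)$ from the plane to itself. Its image is Zariski dense, so the expected fibre dimension is $2 - 2 = 0$, yet the whole line $x = 0$ collapses to the single point $(0,0)$. The sketch below is an illustration under stated assumptions only: it counts fibres over the finite field $F_p$ with $p = 7$ rather than over the algebraically closed field $k$, using point counts as a proxy for dimension, and the name `fibre_sizes` is hypothetical.

```python
from collections import Counter

def fibre_sizes(p):
    """Sizes of the fibres of phi(x, y) = (x, xy) on F_p^2, keyed by image point."""
    return Counter((x, (x * y) % p) for x in range(p) for y in range(p))

p = 7  # illustrative prime
sizes = fibre_sizes(p)

# Fibres through V' = {x != 0}: these are the "generic" fibres.
generic = {w: n for w, n in sizes.items() if w[0] != 0}
print(len(generic), set(generic.values()))

# The exceptional fibre over (0, 0) is the whole line x = 0.
print(sizes[(0, 0)])
```

Every image point with $x \neq 0$ is hit exactly once, whereas the fibre over $(0,0)$ has $p$ points. Removing the lower-dimensional set $\{x = 0\}$ from $V$ restores the expected fibre dimension, which is the role played by $V \setminus V'$ in part (i).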

Appendix B. A quantitative multidimensional Szemerédi theorem

The purpose of this section is to establish the following multidimensional Szemerédi theorem:

Theorem B.1 (Multidimensional Szemerédi theorem). Let $G = (G,+)$ be an additive group, let $k, m \geq 1$ be integers, and let $A \subset G^m$ be a set with $|A| \geq \delta |G|^m$. Then there are $\gg_{k,m,\delta} |G|^{m+1}$ tuples $(a_1,\ldots,a_m,r) \in G^{m+1}$ with the property that

\[ (a_1 + i_1 r, \ldots, a_m + i_m r) \in A \]

for all integers $i_1,\ldots,i_m \in \{-k,\ldots,k\}$.

This is a variant of the multidimensional Szemerédi theorem of Furstenberg and Katznelson [8]. There are now many techniques to establish such results; we will derive Theorem B.1 from the hypergraph removal lemma established in [13], [22],

m, m- 

We first observe that Theorem B.1 may be deduced via a lifting trick from the following apparently weaker version:

Theorem B.2 (Multidimensional Szemerédi theorem, again). Let $G = (G,+)$ be an additive group, let $m \geq 1$ be an integer, and let $A \subset G^m$ be a set with $|A| \geq \delta |G|^m$. Then there are $\gg_{m,\delta} |G|^{m+1}$ tuples $(a,r) \in G^m \times G$ with the property that

\[ a + re_1, \ldots, a + re_m \in A, \]

where we adopt the notation that $g(n_1,\ldots,n_m) := (n_1 g, \ldots, n_m g)$ whenever $g \in G$ and $n_1,\ldots,n_m$ are integers, and $e_1,\ldots,e_m$ is the standard basis of $\mathbb{Z}^m$.



Indeed, to deduce Theorem B.1 from Theorem B.2, let $K := (2k+1)^m$, and let $v_1,\ldots,v_K$ be an enumeration of the $K$ $m$-tuples in $\{-k,\ldots,k\}^m$. If $A \subset G^m$, we let $\tilde A \subset G^{m+K}$ be the set

\[ \tilde A := \{(a, b_1,\ldots,b_K) \in G^m \times G^K : a + b_1 v_1 + \cdots + b_K v_K \in A\}. \]

If $|A| \geq \delta |G|^m$, then it is clear (by freezing $b_1,\ldots,b_K$) that $|\tilde A| \geq \delta |G|^{m+K}$. Applying Theorem B.2, we see that there are $\gg_{k,m,\delta} |G|^{m+K+1}$ tuples $(a, b_1,\ldots,b_K, r) \in G^{m+K+1}$ such that

\[ (a, b_1,\ldots,b_{i-1}, b_i + r, b_{i+1},\ldots,b_K) \in \tilde A \]

for all $1 \leq i \leq K$, which by definition of $\tilde A$ implies that

\[ a' + r v_i \in A \tag{B.1} \]

for all $i = 1,\ldots,K$, where $a' := a + b_1 v_1 + \cdots + b_K v_K$. Since each $a' \in G^m$ arises from at most $|G|^K$ tuples $(a, b_1,\ldots,b_K)$, we conclude that there are $\gg_{k,m,\delta} |G|^{m+1}$ tuples $(a', r) \in G^{m+1}$ such that (B.1) holds for all $i = 1,\ldots,K$, and the claim follows.
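
The two facts used in the lifting trick (the size of the lifted set, and the way a shift of $b_i$ by $r$ moves $a'$ by $r v_i$) can be checked by brute force in a tiny case. The following is a sketch under illustrative assumptions: $G = \mathbb{Z}/5\mathbb{Z}$, $m = 1$, $k = 1$ and the particular set `A` are arbitrary choices, and the names `shift` and `A_lift` are hypothetical.

```python
from itertools import product

N, m, k = 5, 1, 1                                  # G = Z/NZ; illustrative parameters
vs = list(product(range(-k, k + 1), repeat=m))     # v_1, ..., v_K
K = len(vs)                                        # K = (2k+1)^m
A = {(0,), (1,), (3,)}                             # an arbitrary subset of G^m

def shift(a, bs):
    """Return a' = a + b_1 v_1 + ... + b_K v_K as an element of G^m."""
    return tuple((a[j] + sum(b * v[j] for b, v in zip(bs, vs))) % N for j in range(m))

G_m = list(product(range(N), repeat=m))
G_K = list(product(range(N), repeat=K))
A_lift = {(a, bs) for a in G_m for bs in G_K if shift(a, bs) in A}

# Freezing b_1, ..., b_K: every slice of the lifted set is a translate of A.
assert len(A_lift) == len(A) * N ** K

# Replacing b_i by b_i + r replaces a' by a' + r v_i.
for a, bs in A_lift:
    a_prime = shift(a, bs)
    for r in range(N):
        for i, v in enumerate(vs):
            bumped = bs[:i] + ((bs[i] + r) % N,) + bs[i + 1:]
            target = tuple((a_prime[j] + r * v[j]) % N for j in range(m))
            assert ((a, bumped) in A_lift) == (target in A)
```

Both assertions pass for any choice of `A`, which mirrors the fact that the deduction uses only the definition of the lifted set and no structure of $A$.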

We now establish Theorem B.2. Let $G, A, m$ be as in that theorem. For each $i = 1,\ldots,m$, we introduce a set $E_i \subset G^{m+1}$, defined as the set of all tuples $(a_1,\ldots,a_m,s) \in G^{m+1}$ with the property that

\[ (a_1,\ldots,a_{i-1},\, s - a_1 - \cdots - a_{i-1} - a_{i+1} - \cdots - a_m,\, a_{i+1},\ldots,a_m) \in A. \]

Observe that if $(a_1,\ldots,a_m,s)$ lies in the intersection $\bigcap_{i=1}^m E_i$ of all the $E_i$, then by setting $r := s - a_1 - \cdots - a_m$, we have $(a_1,\ldots,a_m) + re_i \in A$ for all $i = 1,\ldots,m$. Thus it will suffice to show that

\[ \Big|\bigcap_{i=1}^m E_i\Big| \gg_{m,\delta} |G|^{m+1}. \]
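
The change of variables behind the sets $E_i$ can be sanity-checked numerically. The sketch below is illustrative only: it takes $G = \mathbb{Z}/5\mathbb{Z}$, $m = 2$ and a randomly chosen set `A`, and the helper names `in_E` and `all_shifts_in_A` are hypothetical. It verifies that a tuple lies in every $E_i$ exactly when all the shifted points $a + re_i$ lie in $A$, and that membership in $E_i$ does not depend on the coordinate $a_i$.

```python
from itertools import product
import random

N, m = 5, 2                                   # G = Z/NZ; illustrative parameters
random.seed(0)
G_m = list(product(range(N), repeat=m))
A = set(random.sample(G_m, 12))               # an arbitrary subset of G^m

def in_E(i, a, s):
    """Membership of (a_1, ..., a_m, s) in E_i (the index i is 0-based here)."""
    others = sum(a[j] for j in range(m) if j != i)
    point = a[:i] + ((s - others) % N,) + a[i + 1:]
    return point in A

def all_shifts_in_A(a, r):
    """True when a + r e_i lies in A for every i."""
    return all(a[:i] + ((a[i] + r) % N,) + a[i + 1:] in A for i in range(m))

for a in G_m:
    for s in range(N):
        r = (s - sum(a)) % N
        # (a, s) lies in every E_i exactly when every a + r e_i lies in A.
        assert all(in_E(i, a, s) for i in range(m)) == all_shifts_in_A(a, r)
        # Membership in E_i does not depend on the coordinate a_i.
        for i in range(m):
            for t in range(N):
                moved = a[:i] + (t,) + a[i + 1:]
                assert in_E(i, moved, s) == in_E(i, a, s)

count = sum(all(in_E(i, a, s) for i in range(m)) for a in G_m for s in range(N))
print(count)  # size of the intersection of the E_i
```

The printed count is at least $|A|$, since the choice $r = 0$ always contributes one tuple for each element of $A$; the content of Theorem B.2 is that it is in fact a positive proportion of $|G|^{m+1}$.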

Let $\varepsilon > 0$ be a sufficiently small quantity depending on $m, \delta$ to be chosen later. Suppose for sake of contradiction that

\[ \Big|\bigcap_{i=1}^m E_i\Big| \leq \varepsilon |G|^{m+1}. \]

Observe that each $E_i$ is $i$-invariant in the sense that the assertion that a given tuple $(a_1,\ldots,a_m,s) \in G^{m+1}$ lies in $E_i$ does not depend on the $i$-th coordinate $a_i$. Because of this, we may apply the hypergraph removal lemma (see e.g. [26, Theorem 1.13]) and conclude (if $\varepsilon$ is small enough depending on $m, \delta$) that there exist $i$-invariant perturbations $E'_i$ of $E_i$ with

\[ |E'_i \Delta E_i| < \frac{\delta}{m} |G|^{m+1} \tag{B.2} \]

such that

\[ \bigcap_{i=1}^m E'_i = \emptyset. \tag{B.3} \]

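We now intersect $E_i, E'_i$ with the hyperplane

\[ \Sigma := \{(a_1,\ldots,a_m, a_1 + \cdots + a_m) : a_1,\ldots,a_m \in G\}. \]

As this hyperplane sits transversely with respect to the $i$-invariant set $E'_i \Delta E_i$ (each line in the $a_i$ direction meets $\Sigma$ in exactly one point, so intersecting an $i$-invariant set with $\Sigma$ divides its cardinality by $|G|$), we conclude from (B.2) that

\[ |(E'_i \Delta E_i) \cap \Sigma| < \frac{\delta}{m} |G|^m \]

and hence from the union bound and (B.3)

\[ \Big|\bigcap_{i=1}^m E_i \cap \Sigma\Big| < \delta |G|^m. \]

On the other hand, since $(a_1,\ldots,a_m, a_1 + \cdots + a_m) \in \bigcap_{i=1}^m E_i \cap \Sigma$ whenever $(a_1,\ldots,a_m) \in A$, we have

\[ \Big|\bigcap_{i=1}^m E_i \cap \Sigma\Big| \geq |A| \geq \delta |G|^m, \]

giving the desired contradiction. This completes the proof of Theorem B.2 and hence Theorem B.1.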
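
The two counting facts about the hyperplane used in this last step can also be checked by brute force. The sketch below is illustrative only, with $G = \mathbb{Z}/4\mathbb{Z}$, $m = 3$ and randomly chosen sets; the names `D`, `Sigma` and `in_E` are hypothetical. It checks that an $i$-invariant set loses exactly a factor of $|G|$ when intersected with the hyperplane, and that the points of the hyperplane lying in every $E_i$ correspond exactly to the elements of $A$.

```python
from itertools import product
import random

N, m = 4, 3                                    # G = Z/NZ; illustrative parameters
random.seed(1)
G_m = list(product(range(N), repeat=m))

# The hyperplane: tuples (a_1, ..., a_m, s) with s = a_1 + ... + a_m.
Sigma = {a + (sum(a) % N,) for a in G_m}

# A random i-invariant subset D of G^{m+1}: pick the coordinates other than a_i
# (m of them, including s) and let a_i range over all of G.
i = 1                                          # 0-based index of the free coordinate
base = random.sample(G_m, 20)
D = {b[:i] + (t,) + b[i:] for b in base for t in range(N)}

# Each line in the a_i direction meets the hyperplane exactly once.
assert len(D) == 20 * N
assert len(D & Sigma) == 20

# On the hyperplane, membership in every E_i reduces to membership in A.
A = set(random.sample(G_m, 10))

def in_E(j, a, s):
    """Membership of (a_1, ..., a_m, s) in E_j (the index j is 0-based here)."""
    others = sum(a[l] for l in range(m) if l != j)
    return a[:j] + ((s - others) % N,) + a[j + 1:] in A

on_Sigma = {a for a in G_m if all(in_E(j, a, sum(a) % N) for j in range(m))}
assert on_Sigma == A
```

The first pair of assertions is the transversality used to pass from (B.2) to its analogue on the hyperplane, and the last assertion is the lower bound by $|A|$ that produces the contradiction.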